The Theory of Group Operating Characteristic Analysis in Discrimination Tasks

Vit Drga

1999

A thesis submitted in fulfillment of the requirements for the degree of Doctor of Philosophy in Psychology.
Victoria University of Wellington, New Zealand.

Overview

My thesis has two main parts: Part I covers the theory of group operating characteristic (GOC) analysis, and Part II covers practical extensions of GOC analysis. For an overview of the thesis, read the separate prefaces for Part I and Part II. In retrospect, the prefaces together read better than the thesis abstract.

Part I: Group Operating Characteristic Analysis

Part I is concerned with the effects of observer inconsistency in discrimination tasks, and how the effects can be removed within the context of the Theory of Signal Detectability (TSD). Chapter 1 gives an overview of TSD, with details of methodologies and measures of performance that are used in later chapters. Chapter 2 describes models of observer inconsistency, and also mean receiver operating characteristic analysis and group operating characteristic (GOC) analysis as means for removing variability due to inconsistency. Chapter 3 describes transform-average GOC analysis, which is a generalisation of GOC analysis that encompasses generalised mean ratings and arbitrary ordinal scaling of a rating scale. Chapter 4 introduces the transfer function, which relates values on a decision axis to values on a rating scale, and shows how a transfer function can be estimated from data. Chapter 5 provides a theory of GOC analysis that incorporates the developments of previous chapters within a single framework. Stochastic ordering is shown to be the key statistical property needed for GOC analysis to remove the effects of inconsistency from experimental data. If stochastic ordering holds, then GOC analysis works for arbitrary transfer functions and arbitrary scalings of a rating scale.

Part II: Functions of Replications Added

For a multiple-replication data set, GOC analysis may be used to minimise unique noise effects and improve performance in a discrimination task. As more replications are combined, performance improves as a function of replications added (FORA). Stable empirical FORAs result from all combinations analysis (ACA), where average performance is calculated over all possible GOC curves for a given number of replications. A widely applicable FORA regression function is introduced. Extrapolation of this function to an infinite number of replications makes it possible to estimate asymptotic unique-noise-free performance, based on a finite data set. Chapter 6 introduces a FORA regression procedure, which can estimate known theoretical performance to better than two decimal places. Chapter 7 applies FORA regression to an amplitude discrimination experiment in which 100 replications were run. The very large data set makes it possible not only to estimate asymptotic performance, but also to estimate sample statistics and error bounds of the asymptote. Chapter 8 shows FORA results for four sets of experiments on frequency discrimination and amplitude discrimination. FORA regression is shown to be very robust across experimental paradigms, observers, types of stimuli, stimulus parameters, performance levels and measures of sensitivity. Chapter 9 is a summary chapter.

Abstract

Inconsistent decision making is a long-standing problem in psychophysics, where decisions based on the same stimulus often differ across replications of an experiment. Inconsistency is described statistically by the concept of unique noise, the effects of which are removed by averaging ratings across replications on a per-stimulus basis. A group operating characteristic (GOC) curve is a type of receiver operating characteristic (ROC) curve based on the mean rating per stimulus. GOC analysis is shown to improve task performance dramatically compared to ROC analysis, and can recover theoretical ROC curves from noisy data. This thesis presents a theory of GOC analysis showing why the procedure works. It also develops transform-average GOC analysis and transfer function analysis, and shows how to estimate unique-noise-free performance from a finite, unique-noise-affected data set.
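The averaging step described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the thesis: evidence is Gaussian and fixed per stimulus, unique noise is independent Gaussian noise added on every replication, ratings are continuous rather than on a discrete scale, and performance is summarised by the area under the ROC curve. The function names and parameter values are hypothetical.

```python
import random
from statistics import mean

def roc_area(noise_scores, signal_scores):
    """Area under the ROC curve as the Mann-Whitney statistic; ties count one half."""
    wins = sum((s > n) + 0.5 * (s == n) for s in signal_scores for n in noise_scores)
    return wins / (len(signal_scores) * len(noise_scores))

def simulate(n_per_class=100, n_reps=20, d_prime=1.0, unique_sd=1.5, seed=0):
    """Return ratings[replication][stimulus] and a parallel list of signal flags."""
    rng = random.Random(seed)
    is_signal = [False] * n_per_class + [True] * n_per_class
    # Assumption: evidence is fixed per stimulus and shared by all replications.
    evidence = [rng.gauss(d_prime if sig else 0.0, 1.0) for sig in is_signal]
    # Assumption: unique noise is drawn afresh for every stimulus on every replication.
    ratings = [[e + rng.gauss(0.0, unique_sd) for e in evidence] for _ in range(n_reps)]
    return ratings, is_signal

def split(scores, is_signal):
    """Separate scores into (noise, signal) lists."""
    noise = [x for x, sig in zip(scores, is_signal) if not sig]
    signal = [x for x, sig in zip(scores, is_signal) if sig]
    return noise, signal

ratings, is_signal = simulate()

# ROC analysis: one ROC area per replication.
single_rep_areas = [roc_area(*split(rep, is_signal)) for rep in ratings]

# GOC analysis: average the ratings per stimulus first, then build one ROC.
mean_rating = [mean(column) for column in zip(*ratings)]
goc_area = roc_area(*split(mean_rating, is_signal))

print(f"mean single-replication ROC area: {mean(single_rep_areas):.3f}")
print(f"GOC area from {len(ratings)} replications: {goc_area:.3f}")
```

In this sketch the GOC area exceeds the average single-replication area because averaging shrinks the unique-noise component while leaving the per-stimulus evidence intact.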

Transform-averaging of ratings (for example, by using geometric or harmonic means) extends GOC analysis to include strictly monotonic increasing (s.m.i.) transformations of rating scale data. Although s.m.i. transforms do not alter ROC curves on any single replication, it is shown that they do alter GOC curves because of unique noise. Nevertheless, GOC analysis may be transform-invariant, apart from residual unique noise effects. Empirical evidence is given showing how GOC performance improves towards theoretical performance regardless of the particular rating scale that is involved.
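A small worked example may help to show why a transform that is harmless on one replication can still change a GOC curve. The sketch below uses made-up ratings for four stimuli over two replications (illustrative values only, not data from the thesis), the natural logarithm as the s.m.i. transform, and ROC area as the performance measure. Averaging log ratings orders stimuli in the same way as the geometric mean, because the exponential is itself s.m.i.

```python
import math
from statistics import mean

def roc_area(noise_scores, signal_scores):
    """Area under the ROC curve as the Mann-Whitney statistic; ties count one half."""
    wins = sum((s > n) + 0.5 * (s == n) for s in signal_scores for n in noise_scores)
    return wins / (len(signal_scores) * len(noise_scores))

def transform_average(ratings, transform):
    """Apply `transform` to every rating, then average over replications per stimulus."""
    return [mean(transform(r) for r in column) for column in zip(*ratings)]

def area(scores, is_signal):
    noise = [x for x, sig in zip(scores, is_signal) if not sig]
    signal = [x for x, sig in zip(scores, is_signal) if sig]
    return roc_area(noise, signal)

# Hypothetical ratings: rows are replications, columns are stimuli.
ratings = [[1, 2, 3, 4],
           [6, 2, 3, 5]]
is_signal = [False, False, True, True]

# A single replication: the log transform leaves the ROC area unchanged.
raw_areas = [area(rep, is_signal) for rep in ratings]
log_areas = [area([math.log(r) for r in rep], is_signal) for rep in ratings]
print(raw_areas == log_areas)  # True

# Across replications: the arithmetic and log (geometric) averages disagree.
arithmetic_area = area(transform_average(ratings, float), is_signal)
geometric_area = area(transform_average(ratings, math.log), is_signal)
print(arithmetic_area, geometric_area)  # 0.75 1.0
```

The first noise stimulus is rated 1 and then 6. Its arithmetic mean (3.5) lies above the signal stimulus rated 3 twice, but its mean log rating lies below, so the ordering of the averaged ratings depends on the scale on which the averaging is done.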

A psychophysical transfer function is an s.m.i. mapping from a decision axis onto a rating scale. Transfer functions underlie theoretical interpretation of empirical ROC analysis, and it is shown how they can be estimated from empirical data. The theory of GOC analysis incorporates both transfer functions and transform-average GOC analysis under the same framework. The theory shows that GOC analysis will work under arbitrary (and possibly unknown) transfer functions, and under arbitrary ordinal scalings of a rating scale, but only when a family of unique-noise-affected evidence distributions is stochastically ordered on the decision axis. If stochastic ordering does not hold, unique-noise-free GOC performance changes according to the scaling of a rating scale. When that is the case, empirical results and subsequent theoretical interpretation become somewhat arbitrary. This finding about unique-noise-affected rating scales also extends to theoretical models that incorporate unique noise. Without stochastic ordering on a decision axis, the theoretical unique-noise-free ROC curve can change following an s.m.i. transform of the decision axis.
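The formal statement of the ordering condition is not reproduced on this page. As an illustration only, the sketch below uses the usual first-order definition of stochastic ordering, under which X is stochastically no larger than Y when the distribution function of X lies on or above that of Y at every point. It checks that condition on empirical distribution functions. The function names and the sample values are hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return a function t -> proportion of the sample that is <= t."""
    ordered = sorted(sample)
    return lambda t: bisect_right(ordered, t) / len(ordered)

def is_stochastically_ordered(samples):
    """Check first-order stochastic ordering of a family of samples.

    `samples` lists the samples from the lowest to the highest evidence level.
    The family is ordered if each empirical CDF lies on or above the next one
    at every observed value.
    """
    grid = sorted({x for sample in samples for x in sample})
    cdfs = [empirical_cdf(sample) for sample in samples]
    return all(lower(t) >= upper(t)
               for lower, upper in zip(cdfs, cdfs[1:])
               for t in grid)

# Pure shifts of one distribution are stochastically ordered.
print(is_stochastically_ordered([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))  # True

# A much wider spread at the lower level makes the CDFs cross, so ordering fails.
print(is_stochastically_ordered([[0, 10], [4, 5]]))  # False
```

The second case fails because the lower-level sample is more variable: its distribution function starts above the other and ends below it, so neither distribution dominates.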

GOC performance improves as a function of replications added (FORA). Stable empirical FORAs result from all combinations analysis (ACA), where average performance is calculated over all possible GOC curves for a given number of replications. The logarithm of FORA increments is generally a linear function of the logarithm of the number of replications, typically with r² > 0.995. This pattern implies a three-parameter data model that provided an excellent description of FORAs from six different experimental projects. These projects involved different aural discrimination tasks, experimental paradigms, decision methodologies, individual observers, levels of performance, stimulus parameters, and measures of sensitivity. Dozens of different FORAs followed the same mathematical form; only the three parameters of the data model changed.
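The two steps described above, ACA followed by a log-log regression of the increments, can be sketched as follows. Several details here are assumptions rather than the thesis procedure: performance is measured as ROC area, the increment from n to n + 1 replications is paired with n, the data are simulated with Gaussian evidence and Gaussian unique noise, and all function names are hypothetical. Non-positive increments, which can occur in small samples, are dropped before taking logarithms.

```python
import math
import random
from itertools import combinations
from statistics import mean

def roc_area(noise_scores, signal_scores):
    """Area under the ROC curve as the Mann-Whitney statistic; ties count one half."""
    wins = sum((s > n) + 0.5 * (s == n) for s in signal_scores for n in noise_scores)
    return wins / (len(signal_scores) * len(noise_scores))

def goc_area(ratings, is_signal, reps):
    """ROC area of the mean rating per stimulus, using only the replications in `reps`."""
    means = [sum(ratings[r][i] for r in reps) / len(reps) for i in range(len(is_signal))]
    noise = [m for m, sig in zip(means, is_signal) if not sig]
    signal = [m for m, sig in zip(means, is_signal) if sig]
    return roc_area(noise, signal)

def fora_by_aca(ratings, is_signal):
    """fora[n - 1] is the mean GOC area over every combination of n replications."""
    n_reps = len(ratings)
    return [mean(goc_area(ratings, is_signal, reps)
                 for reps in combinations(range(n_reps), n))
            for n in range(1, n_reps + 1)]

def loglog_fit(ns, increments):
    """Least-squares line through (log n, log increment): slope, intercept, r squared."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(d) for d in increments]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy ** 2 / (sxx * syy)

# Simulated data (an assumption, not thesis data): 8 replications of 100 stimuli.
rng = random.Random(1)
is_signal = [False] * 50 + [True] * 50
evidence = [rng.gauss(1.0 if sig else 0.0, 1.0) for sig in is_signal]
ratings = [[e + rng.gauss(0.0, 2.0) for e in evidence] for _ in range(8)]

fora = fora_by_aca(ratings, is_signal)
increments = [after - before for before, after in zip(fora, fora[1:])]
pairs = [(n, d) for n, d in enumerate(increments, start=1) if d > 0]
if len(pairs) >= 3:
    slope, intercept, r_squared = loglog_fit(*zip(*pairs))
    print(f"slope {slope:.3f}, intercept {intercept:.3f}, r squared {r_squared:.4f}")
```

ACA visits every subset of replications, so its cost doubles with each replication added. Enumeration is cheap for the 8 replications used here, but a data set with many more replications would need sampling of combinations or some other shortcut, and this sketch does not attempt one.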

Extrapolation of a FORA to an infinite number of replications makes it possible to estimate asymptotic unique-noise-free performance and its sample statistics based on a finite data set. Empirical FORA analysis showed that the observer with the best (unique-noise-affected) ROC performance was often not the observer with the best unique-noise-free performance. This shows that unique noise can generate deceptive results in psychophysics, but that its effects can be removed by using GOC analysis.
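To make the extrapolation concrete, the sketch below assumes one three-parameter form that is consistent with log-linear increments: performance starts at `p1` for a single replication, and the increment from n to n + 1 replications is `c * n ** (-b)`. This parameterisation is an assumption for illustration, and the thesis's own data model may differ. Under it, the asymptote is finite only when `b > 1`, and it equals `p1` plus the sum of all increments.

```python
import math

def fora_asymptote(p1, c, b, n_terms=10_000):
    """Limit of P(n) = p1 + sum_{k=1}^{n-1} c * k**(-b) as n grows without bound.

    Hypothetical parameterisation: p1 is single-replication performance,
    c scales the increments and b is the (positive) log-log slope magnitude.
    """
    if b <= 1:
        raise ValueError("increments c * k**(-b) have a finite sum only when b > 1")
    partial = sum(c * k ** (-b) for k in range(1, n_terms + 1))
    # Midpoint integral approximation to the tail, sum over k > n_terms of c * k**(-b).
    tail = c * (n_terms + 0.5) ** (1 - b) / (b - 1)
    return p1 + partial + tail

# Check against a known series: the sum of 1 / k**2 is pi**2 / 6.
print(abs(fora_asymptote(0.0, 1.0, 2.0) - math.pi ** 2 / 6) < 1e-9)  # True

# Illustrative parameter values only; these are not estimates from the thesis.
print(f"estimated asymptote: {fora_asymptote(p1=0.70, c=0.05, b=1.8):.3f}")
```

Because the asymptote depends on all three parameters, two observers can swap rank between single-replication performance and the asymptote, which is the kind of reversal the paragraph above reports.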

Download

Download in PDF format (3.5 MB)
Download in zipped PostScript format (1.4 MB)
[Download the gsview postscript viewer]

Last updated 08 Nov 2009 04:37 PM