The two-alternative forced choice (2AFC) experimental method is popular in the visual perception literature, where practitioners aim to understand how human observers perceive distances within triplets made of a reference image and two distorted versions. In the past, such experiments were conducted in controlled environments with images shared across triplets, so the perceived quality of the images could be ranked. This ranking was then used to evaluate perceptual distance models against the experimental data. Recently, crowd-sourced perceptual datasets have emerged in which no images are shared between triplets, making ranking infeasible. Evaluating perceptual distance models on this data reduces the judgements on a triplet to a binary decision, namely whether the distance model agrees with the human decision, which is suboptimal and prone to misleading conclusions. Instead, we statistically model the underlying decision-making process during 2AFC experiments using a binomial distribution. Given enough empirical data, we estimate a smooth and consistent distribution of the judgements on the reference-distorted distance plane, according to each distance model. By applying maximum likelihood, we estimate the parameter of the local binomial distribution and obtain a global measurement of the expected log-likelihood of the measured responses. This lets us calculate meaningful and well-founded metrics for a distance model, beyond mere prediction accuracy as percentage agreement, even with variable numbers of judgements per triplet, which are key advantages over both classical and neural network methods.
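The contrast between percentage agreement and a binomial likelihood can be illustrated with a minimal sketch. It is not the estimator described above: the smooth estimate on the distance plane is replaced here by a crude fixed-width binning, and the function names, the `(d0, d1, k, n)` triplet layout, and the `bin_width` parameter are all assumptions made for illustration. Each triplet carries the two model distances `d0` and `d1` (reference to each distortion), the number `n` of human judgements, and the number `k` of those that picked distortion 0 as closer to the reference.

```python
import math
from collections import defaultdict


def binomial_mle(k, n):
    """Maximum-likelihood estimate of the binomial parameter: k successes in n trials."""
    return k / n


def bin_key(d0, d1, bin_width):
    """Map a point on the (d0, d1) distance plane to a square bin (illustrative choice)."""
    return (int(d0 // bin_width), int(d1 // bin_width))


def local_binomial_fit(triplets, bin_width):
    """Pool judgements per bin and fit one binomial parameter per bin by MLE."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [sum of k, sum of n]
    for d0, d1, k, n in triplets:
        key = bin_key(d0, d1, bin_width)
        counts[key][0] += k
        counts[key][1] += n
    return {key: binomial_mle(k, n) for key, (k, n) in counts.items()}


def mean_log_likelihood(triplets, p_by_bin, bin_width, eps=1e-9):
    """Average log-likelihood per judgement under the fitted local parameters.

    The binomial coefficient is omitted because it does not depend on p.
    """
    total, total_n = 0.0, 0
    for d0, d1, k, n in triplets:
        p = p_by_bin[bin_key(d0, d1, bin_width)]
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0) in unanimous bins
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
        total_n += n
    return total / total_n


def percentage_agreement(triplets):
    """Fraction of triplets where the model's choice matches the human majority.

    A tie (2 * k == n) is counted as a majority for distortion 1 in this sketch.
    """
    hits = sum((d0 < d1) == (2 * k > n) for d0, d1, k, n in triplets)
    return hits / len(triplets)


# Made-up example data: (d0, d1, k, n), with a different n per triplet.
triplets = [(0.10, 0.30, 4, 5), (0.12, 0.31, 3, 5), (0.50, 0.20, 1, 4)]
p_by_bin = local_binomial_fit(triplets, bin_width=0.2)
print(percentage_agreement(triplets))
print(mean_log_likelihood(triplets, p_by_bin, bin_width=0.2))
```

In this toy data the model agrees with the majority on every triplet, yet the mean log-likelihood still reflects how split the observers were, which is information that the binary agreement score discards. Pooling counts per bin also handles differing numbers of judgements per triplet without any special casing.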