Predictive Sampling for Efficient Pairwise Subjective Image Quality Assessment

Subjective image quality assessment studies are used in many scenarios, such as the evaluation of compression, super-resolution, and denoising solutions. Among the available subjective test methodologies, pair comparison is gaining popularity due to its simplicity, reliability, and robustness to changes in the test conditions, e.g. display resolution. The main obstacle to its wider adoption is that the number of pairs that subjects must compare grows quadratically with the number of stimuli under test. Usually, the paired comparison data are fed into an aggregation model to obtain a final score for each degraded image, and thus not every comparison contributes equally to the final quality score. In recent years, several solutions that sample pairs (from all possible combinations) have been proposed, ranging from random sampling to active sampling based on the subjects' past decisions. This paper introduces a novel sampling solution called \textbf{P}redictive \textbf{S}ampling for \textbf{P}airwise \textbf{C}omparison (PS-PC), which exploits the characteristics of the input data to predict which pairs should be evaluated by subjects. The proposed solution uses popular machine learning techniques to select the most informative pairs for subjects to evaluate, while for the remaining pairs it predicts the subjects' preferences. The experimental results show that PS-PC outperforms the available sampling algorithms for the same number of pairs. Moreover, since the pairs are chosen \emph{a priori}, before the subjective test starts, the algorithm does not need to run during the test and is thus much simpler to deploy in online crowdsourcing subjective tests.
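To make the quadratic growth and the aggregation step concrete, the following is a minimal sketch, not the method of this paper. It counts the pairs in a full design and fits a Bradley-Terry model with the standard minorization-maximization update. The choice of Bradley-Terry as the aggregation model, the function names `num_pairs`, `all_pairs`, and `bradley_terry`, and the toy win counts are all assumptions made for illustration; the abstract does not state which aggregation model is used.

```python
from itertools import combinations


def num_pairs(n):
    """Number of unordered pairs among n stimuli: n(n-1)/2."""
    return n * (n - 1) // 2


def all_pairs(n):
    """Enumerate every unordered pair (i, j) with i < j in a full design."""
    return list(combinations(range(n), 2))


def bradley_terry(wins, iters=200):
    """Fit Bradley-Terry scores (assumed aggregation model, for illustration).

    wins[i][j] is the number of times stimulus i was preferred over j.
    Returns scores normalized to sum to 1; a higher score means the
    stimulus was preferred more often.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i])
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            # Keep the previous score if stimulus i was never compared.
            new_p.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(new_p)
        p = [x / norm for x in new_p]
    return p


# Hypothetical toy data: 3 stimuli, 10 votes per pair.
toy_wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
scores = bradley_terry(toy_wins)
```

With 10 stimuli a full design already needs 45 pairs, and with 100 stimuli it needs 4950, which is why sampling a subset of pairs before aggregation is attractive.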
