Subjective assessment tests are often used to evaluate image processing systems, notably image and video compression and super-resolution, and are regarded as the most reliable way to provide evidence of the performance of an algorithm or system. While several methodologies can be used in a subjective quality assessment test, pairwise comparison tests are attracting considerable attention because of their accuracy and simplicity. However, the number of comparisons in a pairwise comparison test grows quadratically with the number of stimuli, which often leads to very long tests that are impractical in many cases. Not all pairs contribute equally to the final score, though, so the number of comparisons can be reduced without degrading the final accuracy. To this end, pairwise sampling methods are often used to select the pairs that provide more information about the quality of each stimulus. In this paper, a reliable and much-needed evaluation procedure is proposed and applied to methods already available in the literature, with particular attention to the subjective evaluation of image and video codecs. The results indicate that an appropriate selection of pairs makes it possible to obtain very reliable scores while comparing far fewer pairs.
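To make the quadratic growth mentioned in the abstract concrete, the following minimal sketch counts the comparisons in a full pairwise design, where every stimulus is compared once with every other one. The function names `full_design_pairs` and `enumerate_pairs` are hypothetical and chosen only for illustration; the sketch does not implement any of the sampling methods evaluated in the paper.

```python
from itertools import combinations


def full_design_pairs(n_stimuli: int) -> int:
    """Number of unordered pairs in a full pairwise design: n * (n - 1) / 2."""
    return n_stimuli * (n_stimuli - 1) // 2


def enumerate_pairs(stimuli):
    """List every unordered pair of stimuli (hypothetical helper)."""
    return list(combinations(stimuli, 2))


if __name__ == "__main__":
    # Doubling the number of stimuli roughly quadruples the number of pairs.
    for n in (5, 10, 20, 40):
        print(f"{n} stimuli need {full_design_pairs(n)} comparisons")
```

A pairwise sampling method would present only a subset of the list returned by `enumerate_pairs`, which is where the reduction in test length comes from.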