Deep-learning image coding solutions now achieve compression efficiency similar to or better than that of conventional solutions based on hand-crafted transforms and spatial prediction techniques. These deep-learning codecs require a large training set of images and a training methodology to obtain a suitable model (set of parameters) for efficient compression. Training is performed with an optimization algorithm that minimizes a loss function. The loss function therefore plays a key role in the overall performance of the codec, and it includes a differentiable quality metric that attempts to mimic human perception. The main objective of this paper is to study the perceptual impact of several image quality metrics that can be used in the training loss function, through a crowdsourcing subjective image quality assessment study. The results indicate that the choice of quality metric is critical for the perceptual performance of the deep-learning codec, and that its impact can vary depending on the image content.
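To make the role of the quality metric in the loss function concrete, the following is a minimal sketch that assumes the common rate-distortion form L = R + λ·D, which the text above does not spell out. The function names, the weight `lam`, and the use of mean squared error as the distortion term are illustrative assumptions, not the setup used in this paper. NumPy is used only to show the arithmetic; actual training would compute the same expression in an automatic-differentiation framework so that gradients can flow through the metric.

```python
import numpy as np

def mse_distortion(original, reconstructed):
    """Mean squared error; a stand-in for any differentiable quality metric."""
    return float(np.mean((original - reconstructed) ** 2))

def rate_bpp(likelihoods, num_pixels):
    """Estimated rate in bits per pixel, given the likelihoods that a
    (hypothetical) entropy model assigns to the latent symbols."""
    return float(-np.sum(np.log2(likelihoods)) / num_pixels)

def rd_loss(original, reconstructed, likelihoods, lam=0.01,
            distortion=mse_distortion):
    """Assumed rate-distortion loss: rate + lam * distortion.
    Swapping `distortion` changes which quality metric drives training."""
    num_pixels = original.shape[0] * original.shape[1]
    rate = rate_bpp(likelihoods, num_pixels)
    return rate + lam * distortion(original, reconstructed)

# Toy 4x4 grayscale example with made-up values.
original = np.zeros((4, 4))
reconstructed = np.ones((4, 4))
likelihoods = np.full(16, 0.5)  # one bit per symbol, 16 symbols
loss = rd_loss(original, reconstructed, likelihoods)
```

Passing a different callable as `distortion` (for example, one minus a structural similarity score) is the kind of substitution whose perceptual effect the study evaluates.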