Image coding for machines (ICM) aims to reduce the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy. In many use cases, such as surveillance, it is also important that the compression process does not drastically degrade visual quality. Recent neural network (NN) based ICM codecs have shown significant coding gains over traditional methods; however, the decompressed images often contain checkerboard artifacts, especially at low bitrates. We propose an effective decoder finetuning scheme based on adversarial training that significantly enhances the visual quality of ICM codecs while preserving machine analysis accuracy, without adding extra bit cost or parameters at the inference phase. The results show complete removal of the checkerboard artifacts at the negligible cost of a -1.6% relative change in task performance score. In cases where some amount of artifacts is tolerable, such as when machine consumption is the primary target, the technique can improve both pixel-fidelity and feature-fidelity scores without losing task performance.
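The -1.6% figure is a relative change, not a drop of 1.6 percentage points. The following minimal sketch shows the arithmetic under that reading. The function name `relative_change` and both scores are hypothetical, chosen only to reproduce the magnitude; they are not results from this work.

```python
def relative_change(baseline: float, finetuned: float) -> float:
    """Signed change of `finetuned` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return 100.0 * (finetuned - baseline) / baseline


# Hypothetical task performance scores (illustrative only, not from the paper).
baseline_score = 0.500   # task score of the codec before decoder finetuning
finetuned_score = 0.492  # task score after decoder finetuning

change = relative_change(baseline_score, finetuned_score)
print(f"{change:.1f}%")  # prints "-1.6%"
```

With these assumed numbers, an absolute drop of 0.008 on a baseline of 0.500 corresponds to a relative change of -1.6%.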