CI-ICM: Channel Importance-driven Learned Image Coding for Machines

Traditional human vision-centric image compression methods are suboptimal for machine vision-centric compression because the two differ in visual properties and feature characteristics. To address this problem, we propose Channel Importance-driven learned Image Coding for Machines (CI-ICM), which aims to maximize the performance of machine vision tasks under a given bitrate constraint. First, we propose a Channel Importance Generation (CIG) module to quantify channel importance for machine vision and develop a channel order loss to rank channels in descending order of importance. Second, to properly allocate bitrate among feature channels, we propose a Feature Channel Grouping and Scaling (FCGS) module that non-uniformly groups the feature channels based on their importance and adjusts the dynamic range of each group. Based on FCGS, we further propose a Channel Importance-based Context (CI-CTX) module to allocate bits among feature groups and to preserve higher fidelity in critical channels. Third, to adapt to multiple machine tasks, we propose a Task-Specific Channel Adaptation (TSCA) module that adaptively enhances features for multiple downstream machine tasks. Experimental results on the COCO2017 dataset show that the proposed CI-ICM achieves BD-mAP@50:95 gains of 16.25% in object detection and 13.72% in instance segmentation over the established baseline codec. Ablation studies validate the effectiveness of each contribution, and computational complexity analysis indicates the practicality of CI-ICM. This work establishes feature channel optimization for machine vision-centric compression, bridging the gap between image coding and machine perception.
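To make the idea of importance-based grouping and scaling concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-channel importance scores are already available (in CI-ICM they would come from the CIG module), and it uses hand-picked group sizes and scale factors, whereas the actual FCGS module presumably determines these during training. All function names, group sizes, and scale values here are hypothetical.

```python
from typing import List, Sequence


def rank_channels(importance: Sequence[float]) -> List[int]:
    """Return channel indices sorted by descending importance (ties keep input order)."""
    return sorted(range(len(importance)), key=lambda c: -importance[c])


def group_channels(order: Sequence[int], group_sizes: Sequence[int]) -> List[List[int]]:
    """Split the ranked channel indices into consecutive groups of unequal size."""
    if sum(group_sizes) != len(order):
        raise ValueError("group sizes must sum to the number of channels")
    groups: List[List[int]] = []
    start = 0
    for size in group_sizes:
        groups.append(list(order[start:start + size]))
        start += size
    return groups


def scale_groups(
    features: Sequence[Sequence[float]],
    groups: Sequence[Sequence[int]],
    scales: Sequence[float],
) -> List[List[float]]:
    """Multiply every value of a channel by the scale of the group it belongs to."""
    scaled = [list(channel) for channel in features]
    for group, scale in zip(groups, scales):
        for c in group:
            scaled[c] = [v * scale for v in features[c]]
    return scaled


# Hypothetical example: 4 channels, 2 values each (assumed numbers, not from the paper).
importance = [0.1, 0.9, 0.4, 0.7]
features = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]

order = rank_channels(importance)                 # most important channel first
groups = group_channels(order, group_sizes=[1, 3])  # small group for the top channel
scaled = scale_groups(features, groups, scales=[2.0, 0.5])
```

In this toy setup the single most important channel gets a larger dynamic range before quantization, so it would be represented more finely, while the remaining channels are compressed into a smaller range. This only illustrates the general intuition behind allocating fidelity by importance; the paper's context model (CI-CTX) and task adaptation (TSCA) are not represented here.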
