Feature Coding for Machines (FCM) aims to compress intermediate features effectively for remote intelligent analytics, which is crucial for future intelligent visual applications. In this paper, we propose a Multiscale Feature Importance-based Bit Allocation (MFIBA) method for end-to-end FCM. First, we find that the importance of features for machine vision tasks varies with feature scale, object size, and image instance. Based on this finding, we propose a Multiscale Feature Importance Prediction (MFIP) module that predicts an importance weight for each feature scale. Second, we propose a task loss-rate model that relates the task accuracy loss incurred by using compressed features to the bitrate of encoding those features. Finally, we develop MFIBA for end-to-end FCM, which allocates coding bits across multiscale features according to their importance. Experimental results demonstrate that, when combined with a retained Efficient Learned Image Compression (ELIC) codec, the proposed MFIBA achieves an average bitrate saving of 38.202% in object detection compared with the anchor ELIC. Moreover, it achieves average feature bitrate savings of 17.212% and 36.492% for instance segmentation and keypoint detection, respectively. When applied to LIC-TCM, the proposed MFIBA achieves average bitrate savings of 18.103%, 19.866%, and 19.597% on the three machine vision tasks, respectively, which validates that MFIBA generalizes and adapts well to different machine vision tasks and FCM base codecs.
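To make the idea of importance-driven bit allocation concrete, the following is a minimal sketch and not the MFIBA algorithm itself. The abstract does not specify the form of the task loss-rate model, so the sketch assumes, purely for illustration, that each scale's task loss decays exponentially with its rate, L_i(R_i) = exp(-b * R_i), and it minimizes the importance-weighted sum of losses under a total bit budget. The function name `allocate_bits`, the decay parameter `b`, and the example weights are all hypothetical.

```python
import math


def allocate_bits(weights, total_bits, b=1.0):
    """Split total_bits across feature scales by importance weight.

    Assumption (not from the paper): the task loss of scale i is
    exp(-b * R_i). Minimizing sum_i w_i * exp(-b * R_i) subject to
    sum_i R_i = total_bits gives, over the active scales,
        R_i = total_bits / n + (ln w_i - mean(ln w)) / b.
    Scales whose solution would be negative are clamped to zero and the
    budget is redistributed over the remaining scales.
    Weights must be positive and total_bits non-negative.
    """
    rates = [0.0] * len(weights)
    active = list(range(len(weights)))
    while active:
        mean_log = sum(math.log(weights[i]) for i in active) / len(active)
        trial = {
            i: total_bits / len(active) + (math.log(weights[i]) - mean_log) / b
            for i in active
        }
        negative = [i for i in active if trial[i] < 0]
        if not negative:
            for i in active:
                rates[i] = trial[i]
            break
        # Drop scales that would receive a negative rate and re-solve.
        active = [i for i in active if i not in negative]
    return rates


if __name__ == "__main__":
    # Hypothetical importance weights for three feature scales.
    example_weights = [0.5, 0.3, 0.2]
    print(allocate_bits(example_weights, total_bits=6.0))
```

Under this assumed model, scales with larger importance weights receive more bits, equal weights reduce to a uniform split, and the allocated rates always sum to the budget. In the paper, the weights would come from the MFIP module and the loss-rate relationship from the proposed task loss-rate model, neither of which is reproduced here.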