Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling

Video Coding for Machines (VCM) aims to compress visual signals for machine analysis. However, existing methods consider only a few machines and neglect the majority. Moreover, machines' perceptual characteristics are not leveraged effectively, resulting in suboptimal compression efficiency. To overcome these limitations, this paper introduces Satisfied Machine Ratio (SMR), a metric that statistically evaluates the perceptual quality of compressed images and videos for machines by aggregating the machines' satisfaction scores. Each score is derived from the machine's perceptual difference between the original and compressed images. Targeting image classification and object detection tasks, we build two representative machine libraries for SMR annotation and create a large-scale SMR dataset to facilitate SMR studies. We then propose an SMR prediction model based on the correlation between deep feature differences and SMR. Furthermore, we introduce an auxiliary task that increases prediction accuracy by predicting the SMR difference between two images of different quality. Extensive experiments demonstrate that SMR models significantly improve compression performance for machines and generalize robustly to unseen machines, codecs, datasets, and frame types. SMR enables perceptual coding for machines and propels VCM from specificity to generality. Code is available at https://github.com/ywwynm/SMR.
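
To make the idea of aggregating satisfaction scores concrete, here is a minimal Python sketch. It is not taken from the paper or its repository. It assumes that a machine is "satisfied" when its predicted label for the compressed image matches its prediction for the original, and that SMR is the fraction of satisfied machines in the library. The names `make_threshold_machine`, `satisfaction_score`, and `satisfied_machine_ratio` are hypothetical, and the toy "machines" are simple threshold classifiers over a scalar stand-in for an image.

```python
from typing import Callable, Sequence

# Hypothetical type: a "machine" maps an image to a predicted class label.
# Here an image is reduced to a single float purely for illustration.
Machine = Callable[[float], int]


def make_threshold_machine(threshold: float) -> Machine:
    """Build a toy classifier that outputs 1 if the input exceeds `threshold`."""
    return lambda image: int(image > threshold)


def satisfaction_score(machine: Machine, original: float, compressed: float) -> int:
    """Return 1 if the machine's prediction is unchanged by compression, else 0.

    Assumption: "unchanged prediction" stands in for the machine perceptual
    difference mentioned in the abstract; the paper's criterion may differ.
    """
    return int(machine(original) == machine(compressed))


def satisfied_machine_ratio(
    machines: Sequence[Machine], original: float, compressed: float
) -> float:
    """Fraction of machines in the library that are satisfied."""
    if not machines:
        raise ValueError("machine library must not be empty")
    scores = [satisfaction_score(m, original, compressed) for m in machines]
    return sum(scores) / len(scores)


# Toy machine library with four different decision thresholds.
library = [make_threshold_machine(t) for t in (0.1, 0.3, 0.38, 0.5)]
smr = satisfied_machine_ratio(library, original=0.40, compressed=0.35)
```

In this toy setup only the machine with threshold 0.38 changes its prediction, so three of the four machines are satisfied. A real machine library would hold trained classifiers or detectors, and the satisfaction criterion would follow the paper's definition.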
