ResiComp: Loss-Resilient Image Compression via Dual-Functional Masked Visual Token Modeling

Recent neural image codecs (NICs) achieve strong compression performance, but limited attention has been paid to their error resilience. As a result, these NICs tend to be sensitive to packet losses, which are prevalent in real-time communications. In this paper, we investigate how to improve the resilience of NICs against packet losses. We propose ResiComp, a pioneering neural image compression framework with feature-domain packet loss concealment (PLC). Motivated by the inherent consistency between generation and compression, we advocate merging the tasks of entropy modeling and PLC into a unified framework focused on latent space context modeling. To this end, we take inspiration from the impressive generative capabilities of large language models (LLMs), particularly the recent advances in masked visual token modeling (MVTM). During training, we integrate MVTM to mirror the effects of packet loss, enabling a dual-functional Transformer to restore the masked latents by predicting both their missing values and their conditional probability mass functions. ResiComp jointly optimizes compression efficiency and loss resilience. Moreover, ResiComp provides flexible coding modes, allowing the efficiency-resilience trade-off to be adjusted explicitly in response to varying Internet or wireless network conditions. Extensive experiments demonstrate that ResiComp significantly enhances the resilience of NICs against packet losses while offering a favorable trade-off between compression efficiency and packet loss resilience.
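The idea of using masking to mirror packet loss during training can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the interleaved token-to-packet assignment, and the `mask_id` placeholder are all assumptions made for illustration, and the abstract does not specify how ResiComp packetizes latents.

```python
import random


def simulate_packet_loss(num_tokens, num_packets, loss_rate, seed=0):
    """Return (mask, lost_packets) for a sequence of latent tokens.

    mask[i] is True when token i belongs to a dropped packet and would
    have to be concealed by the context model.
    """
    rng = random.Random(seed)
    # Assumption: interleaved packetization, token i goes to packet i % num_packets.
    packet_of = [i % num_packets for i in range(num_tokens)]
    # Each packet is dropped independently with probability loss_rate.
    lost_packets = {p for p in range(num_packets) if rng.random() < loss_rate}
    mask = [packet_of[i] in lost_packets for i in range(num_tokens)]
    return mask, lost_packets


def apply_mask(tokens, mask, mask_id=-1):
    """Replace lost tokens with a placeholder id (hypothetical [MASK] token)."""
    return [mask_id if is_lost else tok for tok, is_lost in zip(tokens, mask)]


if __name__ == "__main__":
    tokens = list(range(16))  # stand-in for quantized latent token indices
    mask, lost = simulate_packet_loss(len(tokens), num_packets=4, loss_rate=0.5)
    print("lost packets:", sorted(lost))
    print("masked input:", apply_mask(tokens, mask))
```

In a training loop of this kind, the masked sequence would be fed to the Transformer, which is asked to predict the values and probability distributions of the masked positions; the same mask pattern then corresponds to a packet-loss pattern at inference time.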

