Semantic-Preserving Image Coding based on Conditional Diffusion Models

Rather than aiming at bit-by-bit recovery of the transmitted messages, semantic communication focuses on the meaning and the goal of the communication itself. In this paper, we propose a novel semantic image coding scheme that preserves the semantic content of an image while ensuring a good trade-off between coding rate and image quality. In the proposed Semantic-Preserving Image Coding based on Conditional Diffusion Models (SPIC), the transmitter encodes a Semantic Segmentation Map (SSM) and a low-resolution version of the image to be transmitted. The receiver then reconstructs a high-resolution image using a Denoising Diffusion Probabilistic Model (DDPM) doubly conditioned on the SSM and the low-resolution image. As shown by the numerical examples, the proposed SPIC achieves a better balance than state-of-the-art (SOTA) approaches between the conventional rate-distortion trade-off and the preservation of semantically relevant features.
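To make the data flow concrete, the following is a minimal sketch of the two signals exchanged in such a scheme. It is not the authors' implementation. The function names (`downsample`, `transmit`, `build_condition`), the average-pooling downsampler, the nearest-neighbour upsampling, and the channel-wise stacking of the two conditioning signals are all assumptions made for illustration; the abstract does not specify how the SSM or the low-resolution image is coded, nor how the DDPM is conditioned. The denoising network itself is omitted.

```python
import numpy as np

def downsample(image, factor):
    """Average-pool an (H, W, C) image by an integer factor (assumed downsampler)."""
    h, w, c = image.shape
    assert h % factor == 0 and w % factor == 0
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def transmit(image, ssm, factor=4):
    """Hypothetical transmitter: send the SSM plus a coarse version of the image."""
    return {"ssm": ssm.astype(np.uint8), "low_res": downsample(image, factor)}

def build_condition(payload, num_classes, factor=4):
    """Hypothetical receiver step: assemble the double conditioning signal.

    The SSM is one-hot encoded and the coarse image is upsampled by pixel
    repetition; stacking them along the channel axis is an assumption.
    """
    one_hot = np.eye(num_classes, dtype=np.float32)[payload["ssm"]]  # (H, W, K)
    coarse = payload["low_res"].repeat(factor, axis=0).repeat(factor, axis=1)  # (H, W, C)
    return np.concatenate([coarse, one_hot], axis=-1)  # (H, W, C + K)

# Toy data standing in for a real image and its segmentation map.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3)).astype(np.float32)
ssm = rng.integers(0, 5, size=(64, 64))

payload = transmit(image, ssm)
cond = build_condition(payload, num_classes=5)
print(payload["low_res"].shape, cond.shape)
```

In a full system, `cond` would be passed to a conditional denoiser at every reverse diffusion step, and both parts of the payload would be entropy coded before transmission.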
