In the emerging paradigm of semantic communication (SC), the focus shifts from delivering bits to delivering the meaning behind them, by extracting semantic information from raw data. Recent advances in data-to-text models enable language-oriented SC, notably text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, text is too coarse to precisely capture fine-grained visual features such as spatial location, color, and texture, incurring a significant perceptual gap between the intended and reconstructed images. To address this limitation, in this paper we propose a novel language-oriented SC framework that communicates both text and a compressed image embedding, and combines them using a latent diffusion model to reconstruct the intended image. Experimental results validate the potential of our approach, which transmits only 2.09\% of the original image size while achieving higher perceptual similarity over noisy communication channels than a baseline SC method that communicates through text alone. The code is available at https://github.com/ispamm/Img2Img-SC/ .
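The transmitter-receiver flow described above (an I2T caption plus a compressed latent embedding, sent over a noisy channel and fused by a T2I decoder) can be sketched structurally as follows. This is a minimal sketch with stand-in components, not the paper's actual models: the hard-coded caption replaces a real I2T captioner, the uniform 8-bit quantizer replaces the paper's embedding compression, and the AWGN channel and latent shape are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def i2t_encode(image):
    # Stand-in for an image-to-text captioner; a real system would
    # generate this caption from the input image.
    return "a red car parked on the left side of a narrow street"

def compress_embedding(latent, bits=8):
    # Illustrative uniform quantization of the latent image embedding.
    lo, hi = latent.min(), latent.max()
    q = np.round((latent - lo) / (hi - lo) * (2**bits - 1)).astype(np.uint8)
    return q, (lo, hi)

def awgn(symbols, snr_db):
    # Additive white Gaussian noise channel at the given SNR (in dB).
    power = np.mean(symbols.astype(np.float64) ** 2)
    noise_var = power / 10 ** (snr_db / 10)
    return symbols + rng.normal(0.0, np.sqrt(noise_var), symbols.shape)

# Transmitter: caption + compressed latent embedding of the image.
latent = rng.standard_normal((4, 8, 8))  # toy latent; shape is an assumption
caption = i2t_encode(None)
quantized, (lo, hi) = compress_embedding(latent)

# Channel: the embedding traverses a noisy channel
# (the short text payload is assumed error-protected here).
received = awgn(quantized.astype(np.float64), snr_db=20)

# Receiver: de-quantize; caption + recovered latent would then condition
# a latent diffusion (T2I) model to reconstruct the intended image.
dequantized = np.clip(received, 0, 255) / 255 * (hi - lo) + lo
print(caption, dequantized.shape)
```

The key design point the sketch mirrors is that the two payloads are complementary: the text carries coarse semantics cheaply, while the compressed embedding restores the fine-grained visual detail (layout, color, texture) that text alone loses.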