Conventional communication systems, including both separation-based coding and AI-driven joint source-channel coding (JSCC), are largely guided by Shannon's rate-distortion theory. However, generic distortion metrics fail to capture complex human visual perception, and optimizing for them often yields blurred or unrealistic reconstructions. In this paper, we propose Joint Source-Channel-Generation Coding (JSCGC), a novel paradigm that shifts the focus from deterministic reconstruction to probabilistic generation. JSCGC uses a generative model at the receiver as a generator, rather than as a conventional decoder, to parameterize the data distribution. This enables direct maximization of mutual information under channel constraints while controlling stochastic sampling, so that outputs reside on the authentic data manifold with high fidelity. We further derive a theoretical lower bound on the maximum semantic inconsistency for a given amount of transmitted mutual information, elucidating the fundamental limits of communication in controlling the generative process. Extensive experiments on image transmission demonstrate that JSCGC substantially improves perceptual quality and semantic fidelity, significantly outperforming conventional distortion-oriented JSCC methods.