End-to-end visual communication systems typically optimize a trade-off between channel bandwidth cost and signal-level distortion metrics. Under challenging physical conditions, however, this traditional discriminative communication paradigm often produces unrealistic reconstructions with perceptible blurring and aliasing artifacts, even when perceptual or adversarial losses are included in training. The issue stems primarily from the receiver's limited knowledge of the underlying data manifold and from the use of deterministic decoding mechanisms. To address these limitations, this paper introduces DiffCom, a novel end-to-end generative communication paradigm that uses off-the-shelf generative priors and probabilistic diffusion models for decoding, thereby improving perceptual quality without relying heavily on bandwidth cost or received signal quality. Unlike traditional systems that rely on deterministic decoders optimized solely for distortion metrics, DiffCom leverages the raw channel-received signal as a fine-grained condition to guide stochastic posterior sampling. A novel confirming constraint ensures that reconstructions remain on the manifold of real data, enhancing the robustness and reliability of the generated outcomes. Furthermore, DiffCom incorporates a blind posterior sampling technique to handle scenarios in which the forward transmission characteristics are unknown. Extensive experiments demonstrate that DiffCom not only produces realistic reconstructions with details faithful to the original data but also achieves superior robustness against diverse wireless transmission degradations. Collectively, these advances establish DiffCom as a new benchmark for designing generative communication systems with improved robustness and generalization.
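To make the idea of using the received signal to guide stochastic posterior sampling concrete, the following is a minimal toy sketch, not DiffCom's actual decoder. It assumes a scalar source with a Gaussian prior (standing in for a learned diffusion prior), a known linear channel `y = h * x + n` with Gaussian noise, and plain Langevin dynamics as the sampler. All function names and parameters (`posterior_sample`, `analytic_posterior`, `h`, `noise_std`, `prior_mean`, `prior_std`) are hypothetical and chosen only for illustration. Each update adds the prior score and the likelihood score computed from the received signal, so samples are drawn from the posterior rather than produced by a deterministic mapping.

```python
import random


def posterior_sample(y, h, noise_std, prior_mean, prior_std,
                     steps=300, step_size=0.01, rng=None):
    """Draw one sample from p(x | y) with unadjusted Langevin dynamics.

    Toy assumptions: x ~ N(prior_mean, prior_std^2) and y = h * x + n,
    with n ~ N(0, noise_std^2). A real system would replace the Gaussian
    prior score with the score of a learned generative prior.
    """
    rng = rng or random.Random()
    x = rng.gauss(0.0, 1.0)  # arbitrary initialization
    for _ in range(steps):
        # Gradient of log p(x): pulls the sample toward the prior.
        prior_score = -(x - prior_mean) / prior_std ** 2
        # Gradient of log p(y | x): the received signal acts as the condition.
        likelihood_score = h * (y - h * x) / noise_std ** 2
        noise = rng.gauss(0.0, 1.0)
        x += step_size * (prior_score + likelihood_score)
        x += (2.0 * step_size) ** 0.5 * noise
    return x


def analytic_posterior(y, h, noise_std, prior_mean, prior_std):
    """Closed-form posterior mean and variance for the Gaussian toy model."""
    precision = 1.0 / prior_std ** 2 + h ** 2 / noise_std ** 2
    mean = (prior_mean / prior_std ** 2 + h * y / noise_std ** 2) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [posterior_sample(1.2, 0.8, 0.5, 0.0, 1.0, rng=rng)
               for _ in range(1000)]
    empirical_mean = sum(samples) / len(samples)
    print("empirical mean:", empirical_mean)
    print("analytic (mean, variance):",
          analytic_posterior(1.2, 0.8, 0.5, 0.0, 1.0))
```

Because the toy model is fully Gaussian, the empirical statistics of the samples can be checked against the closed-form posterior. With a learned prior no such closed form exists, which is why sampling-based decoding is used in that setting.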