Deep learning (DL)-based semantic communication (SemCom) is becoming critical for maximizing the overall efficiency of communication networks. Nevertheless, SemCom is sensitive to wireless channel uncertainties and source outliers, and suffers from poor generalization. To address these challenges, this paper develops a latent diffusion model-enabled SemCom system with three key contributions: i) to handle potential outliers in the source data, semantic errors, obtained by projected gradient descent that exploits the vulnerabilities of DL models, are used to update the parameters and obtain an outlier-robust encoder; ii) a lightweight single-layer latent space transformation adapter completes one-shot learning at the transmitter and is placed before the decoder at the receiver, enabling adaptation to out-of-distribution data and enhancing human-perceptual quality; and iii) an end-to-end consistency distillation (EECD) strategy is used to distill the diffusion models trained in latent space, enabling deterministic single-step or few-step low-latency denoising in various noisy channels while maintaining high semantic quality. Extensive numerical experiments across different datasets demonstrate the superiority of the proposed SemCom system: it is robust to outliers, can transmit data with unknown distributions, and performs real-time channel denoising while preserving high human-perceptual quality, outperforming existing denoising approaches in semantic metrics such as learned perceptual image patch similarity (LPIPS).
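To make the first contribution concrete, the following is a minimal sketch of how projected gradient descent can search for a bounded input perturbation (a "semantic error") that maximally shifts an encoder's latent code. It is not the paper's implementation: the linear toy encoder `W`, the L-infinity budget `eps`, the step size `alpha`, and the function names are all assumptions chosen for illustration. In the proposed system, such perturbed inputs would then be used to update the encoder parameters, a step this sketch omits.

```python
import numpy as np

def distortion(W, delta):
    """Squared shift of the latent code W @ x caused by perturbation delta."""
    return float(np.sum((W @ delta) ** 2))

def pgd_semantic_error(W, x, eps=0.1, alpha=0.02, steps=20, seed=0):
    """Find delta with ||delta||_inf <= eps that maximizes ||W(x+delta) - Wx||^2.

    W is a hypothetical linear encoder standing in for a DL model.
    """
    rng = np.random.default_rng(seed)
    # Random start inside the eps-ball (the gradient is zero at delta = 0).
    delta = rng.uniform(-eps, eps, size=x.shape)
    clean_latent = W @ x
    for _ in range(steps):
        # Analytic gradient of the squared latent shift with respect to delta.
        grad = 2.0 * W.T @ (W @ (x + delta) - clean_latent)
        # Signed ascent step, then projection back onto the L-infinity ball.
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return delta

# Example usage with an assumed 8-dimensional source and 4-dimensional latent.
W = np.random.default_rng(1).normal(size=(4, 8))
x = np.ones(8)
delta = pgd_semantic_error(W, x)
print("latent distortion:", distortion(W, delta))
```

The projection step (`np.clip`) keeps the perturbation within the allowed budget, so the perturbed input stays close to the original source while its latent representation moves as far as possible.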