Vector quantization-based image semantic communication systems have substantially improved transmission efficiency, but they face a fundamental conflict between codebook design and digital constellation modulation: traditional codebooks require a wide index range, whereas modulation favors a small number of discrete states. To address this, we propose a multilevel generative semantic communication system with a two-stage training framework. In the first stage, we train a high-quality codebook, using a multi-head octonary codebook (MOC) to compress the index range, and integrate a residual vector quantization (RVQ) mechanism for effective multilevel communication. In the second stage, we introduce a noise reduction block (NRB) based on Swin Transformer, which, coupled with the multilevel codebook from the first stage, serves as a high-quality semantic knowledge base (SKB) for generative feature restoration. Experimental results show that MOC-RVQ outperforms methods such as BPG and JPEG, even without channel error correction coding.
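The residual vector quantization mechanism mentioned above can be illustrated with a minimal sketch: each level quantizes the residual left by the previous level, so a vector is represented by one small index per level. The function names, vector dimension, number of levels, and the eight-codeword ("octonary") codebook size below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_codeword(x, codebook):
    """Return the closest codeword to x and its index (hypothetical helper)."""
    dists = np.linalg.norm(codebook - x, axis=1)
    idx = int(np.argmin(dists))
    return codebook[idx], idx

def rvq_encode(x, codebooks):
    """Residual VQ: each stage quantizes what the previous stages missed."""
    residual = x.copy()
    approx = np.zeros_like(x)
    indices = []
    for cb in codebooks:
        q, idx = nearest_codeword(residual, cb)
        approx += q          # accumulate the multilevel reconstruction
        residual -= q        # pass the remaining error to the next level
        indices.append(idx)  # one small index per level is transmitted
    return indices, approx

# Toy setup: 3 levels, each with 8 codewords, so every index fits in 3 bits,
# matching the few discrete states that digital modulation favors.
dim, levels, n_codes = 4, 3, 8
codebooks = [rng.normal(size=(n_codes, dim)) for _ in range(levels)]
x = rng.normal(size=dim)
indices, approx = rvq_encode(x, codebooks)
```

With octonary codebooks, each transmitted index is a single value in 0..7, which maps naturally onto a small constellation, while stacking levels recovers representational capacity that a single small codebook would lack.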