Lightweight and efficient deep joint source-channel coding (JSCC) is a key technology for semantic communications. In this paper, we design a novel JSCC scheme named MambaJSCC, which uses a visual state space model with channel adaptation (VSSM-CA) block as its backbone for transmitting images over wireless channels. The VSSM-CA block uses VSSM to integrate two-dimensional images with the state space, enabling feature extraction and encoding to operate with linear complexity. It also incorporates channel state information (CSI) via a newly proposed CSI embedding method. This method deploys a shared CSI encoding module within both the encoder and the decoder to encode the CSI and inject it into each VSSM-CA block, improving the adaptability of a single model to varying channel conditions. Experimental results show that MambaJSCC not only outperforms Swin Transformer-based JSCC (SwinJSCC) but also significantly reduces parameter size, computational overhead, and inference delay (ID). For example, when employing the same number of VSSM-CA blocks as Swin Transformer blocks, MambaJSCC achieves a 0.48 dB gain in peak signal-to-noise ratio (PSNR) over SwinJSCC while requiring only 53.3% of the multiply-accumulate operations, 53.8% of the parameters, and 44.9% of the ID.
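To put the reported PSNR gain in perspective, the following minimal sketch uses the standard PSNR definition to convert a dB gain into the corresponding reduction in mean squared error (MSE). The function names and the peak value of 255 (8-bit images) are assumptions made for illustration and are not taken from the paper.

```python
import math


def psnr_db(mse: float, peak: float = 255.0) -> float:
    """Standard PSNR in dB for a given MSE; peak=255 assumes 8-bit pixels."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE is multiplied when PSNR rises by gain_db (same peak)."""
    return 10.0 ** (-gain_db / 10.0)


# Hypothetical baseline MSE, chosen only to demonstrate the relation.
baseline_mse = 100.0
ratio = mse_ratio_for_gain(0.48)
improved_mse = baseline_mse * ratio

print(f"MSE ratio for a 0.48 dB gain: {ratio:.4f}")
print(f"PSNR difference: {psnr_db(improved_mse) - psnr_db(baseline_mse):.2f} dB")
```

Under this definition, a 0.48 dB gain corresponds to roughly 10% lower MSE at the same peak value, independent of the baseline MSE chosen.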