This paper studies a lightweight image semantic communication system deployed on Internet of Things (IoT) devices. In the considered system model, devices must use semantic communication techniques to support user behavior recognition in ultimate video services with high data transmission efficiency. However, deploying semantic codecs on IoT devices is computationally expensive because training and inference of deep learning (DL) based codecs involve complex computation. To make semantic communication affordable for IoT devices, we propose an attention-based UNet-enabled lightweight image semantic communication (LSSC) system that achieves low computational complexity and a small model size. In particular, the LSSC system first trains the codec at the edge server to reduce the training computation load on IoT devices. Then, we introduce the convolutional block attention module (CBAM) to extract image semantic features and decrease the number of downsampling layers, thereby reducing the number of floating-point operations (FLOPs). Finally, we experimentally adjust the codec structure and identify the optimal number of downsampling layers. Simulation results show that, compared to the baseline, the proposed LSSC system reduces the semantic codec FLOPs by 14% and the model size by 55%, at the cost of a 3% loss in accuracy. Moreover, the proposed scheme achieves higher transmission accuracy than the traditional communication scheme in the low channel signal-to-noise ratio (SNR) region.
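The relationship between the number of downsampling layers, FLOPs, and model size mentioned in the abstract can be illustrated with a rough cost count for a UNet-style encoder. The sketch below is not the LSSC codec: the channel widths (`base_ch=64`, doubling per stage), the two 3x3 convolutions per stage, the 256x256 input, and the function names are all assumptions chosen for illustration, and the CBAM and decoder costs are ignored.

```python
def conv2d_cost(c_in, c_out, k, h, w):
    """Return (params, flops) for one k x k convolution with 'same' padding.

    FLOPs are counted as 2 x multiply-accumulates; the bias is counted in params.
    """
    params = c_in * c_out * k * k + c_out
    macs = c_in * c_out * k * k * h * w
    return params, 2 * macs


def unet_encoder_cost(depth, in_ch=3, base_ch=64, size=256):
    """Cost of a hypothetical UNet-style encoder with `depth` downsampling layers.

    Assumed layout (not taken from the paper): each stage applies two 3x3
    convolutions; between stages the resolution halves and the channel
    width doubles. Pooling cost is ignored.
    """
    total_params, total_flops = 0, 0
    c_in, c_out, h = in_ch, base_ch, size
    for _ in range(depth + 1):  # `depth` downsamplings give depth + 1 stages
        for ci in (c_in, c_out):  # first conv changes width, second keeps it
            p, f = conv2d_cost(ci, c_out, 3, h, h)
            total_params += p
            total_flops += f
        c_in, c_out, h = c_out, c_out * 2, h // 2
    return total_params, total_flops


if __name__ == "__main__":
    for depth in (4, 3, 2):
        params, flops = unet_encoder_cost(depth)
        print(f"depth={depth}: params={params:,} flops={flops:,}")
```

Under these assumptions, removing the deepest stage shrinks the parameter count much more than the FLOPs, because the deepest stages hold most of the weights while operating on the smallest feature maps. This is qualitatively consistent with a model-size reduction that exceeds the FLOPs reduction, but the percentages produced by this toy count are not the paper's reported figures.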