Semantic communications provide significant performance gains over traditional communications by transmitting task-relevant semantic features through wireless channels. However, most existing studies rely on end-to-end (E2E) training of neural-network-based encoders and decoders to ensure effective transmission of these semantic features. To enable semantic communications without E2E training, this paper presents a vision transformer (ViT)-based semantic communication system with importance-aware quantization (IAQ) for wireless image transmission. The core idea of the presented system is to use the attention scores of a pretrained ViT model to quantify the importance levels of image patches. Building on this idea, our IAQ framework assigns a different number of quantization bits to each image patch according to its importance level. This is achieved by formulating a weighted quantization error minimization problem, where the weight of each patch is an increasing function of its attention score. An optimal incremental allocation method and a low-complexity water-filling method are then devised to solve the formulated problem. Our framework is further extended to realistic digital communication systems by modifying the bit allocation problem and the corresponding allocation methods based on an equivalent binary symmetric channel (BSC) model. Simulations on single-view and multi-view image classification tasks show that our IAQ framework outperforms conventional image compression methods in both error-free and realistic communication scenarios.
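The incremental allocation idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the distortion model or the weight function, so the sketch assumes a per-patch distortion of `2 ** (-2 * b)` for `b` bits (a common high-resolution approximation for uniform scalar quantization), uses the attention score itself as the weight, and caps each patch at a hypothetical `max_bits`. The function name `allocate_bits` and all of its parameters are illustrative.

```python
import heapq


def allocate_bits(attention_scores, total_bits, max_bits=8):
    """Greedy incremental bit allocation (illustrative sketch).

    Assumptions not taken from the paper:
      * weight of patch i = its attention score,
      * distortion of a patch quantized with b bits = 2 ** (-2 * b),
      * no patch receives more than max_bits bits.
    """
    n = len(attention_scores)
    bits = [0] * n

    def gain(i, b):
        # Reduction in weighted distortion when patch i goes from b to b + 1 bits.
        return attention_scores[i] * (2.0 ** (-2 * b) - 2.0 ** (-2 * (b + 1)))

    # heapq is a min-heap, so gains are negated to pop the largest gain first.
    heap = [(-gain(i, 0), i) for i in range(n)]
    heapq.heapify(heap)

    remaining = total_bits
    while remaining > 0 and heap:
        _, i = heapq.heappop(heap)
        bits[i] += 1
        remaining -= 1
        if bits[i] < max_bits:
            # Re-insert the patch with the gain of its next bit.
            heapq.heappush(heap, (-gain(i, bits[i]), i))
    return bits
```

Under these assumptions, `allocate_bits([0.5, 0.3, 0.15, 0.05], 8)` returns `[3, 2, 2, 1]`: patches with higher attention scores receive more bits. Because the assumed per-bit gain shrinks with every added bit, giving one bit at a time to the patch with the largest gain minimizes the weighted distortion for this particular model. Whether the same holds for the paper's objective depends on its actual formulation.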