Semantic communication has evolved considerably with the recent rapid development of artificial intelligence (AI), improving both the robustness and the efficiency of communication. Despite these advances, most current semantic communication methods for image transmission pay little attention to the differing importance of objects and backgrounds in images. To address this issue, we propose a novel scheme named ASCViT-JSCC, which integrates vision transformers (ViTs) with an orthogonal frequency division multiplexing (OFDM) system. The scheme adaptively allocates bandwidth to the objects and backgrounds in an image according to an importance ordering of image regions, determined by object detection with You Only Look Once version 5 (YOLOv5) and feature-point detection with the scale-invariant feature transform (SIFT). Furthermore, the proposed scheme incorporates quantization modules so that it is compatible with digital modulation standards. We validate the approach on an over-the-air (OTA) testbed, the intelligent communication prototype validation platform (ICP), built on a software-defined radio (SDR) and NVIDIA embedded kits. Results from both simulations and practical measurements show that ASCViT-JSCC preserves objects in images markedly better and achieves higher reconstruction quality than existing methods.
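The importance-ordered bandwidth allocation described above can be illustrated with a minimal sketch. It is not the ASCViT-JSCC implementation: the `Patch` structure, the `importance` score (an object bonus plus a feature-point count), and the proportional split of a symbol budget are all assumptions made for illustration. In practice the `in_object` flag and the `keypoints` count would come from an object detector and a feature-point detector, respectively.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    index: int        # position of the patch in the image grid
    in_object: bool   # True if the patch overlaps a detected object box
    keypoints: int    # number of feature points found inside the patch


def importance(patch, object_bonus=10.0):
    # Hypothetical score: object patches outrank background patches,
    # and the feature-point count refines the order within each group.
    return (object_bonus if patch.in_object else 0.0) + patch.keypoints


def allocate_symbols(patches, total_symbols, min_symbols=1):
    """Split a fixed budget of channel symbols across patches by importance."""
    if not patches:
        return {}
    if total_symbols < min_symbols * len(patches):
        raise ValueError("budget too small for the per-patch minimum")

    # Every patch first receives the minimum, so the background is never dropped.
    alloc = {p.index: min_symbols for p in patches}
    remaining = total_symbols - min_symbols * len(patches)

    scores = {p.index: importance(p) for p in patches}
    total_score = sum(scores.values())
    ranked = sorted(patches, key=importance, reverse=True)

    # Share the rest of the budget in proportion to the importance scores.
    if total_score > 0:
        for p in ranked:
            alloc[p.index] += int(remaining * scores[p.index] / total_score)

    # Symbols lost to rounding go to the most important patches first.
    leftover = total_symbols - sum(alloc.values())
    for i in range(leftover):
        alloc[ranked[i % len(ranked)].index] += 1
    return alloc


if __name__ == "__main__":
    demo = [Patch(0, True, 5), Patch(1, False, 2), Patch(2, False, 0)]
    print(allocate_symbols(demo, total_symbols=32))
```

The per-patch minimum reflects one possible design choice: background regions still receive some bandwidth, while object regions with many feature points receive the largest share.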