This work introduces the Semantically Masked Vector Quantized Generative Adversarial Network (SQ-GAN), a novel approach that integrates semantically driven image coding and vector quantization to optimize image compression for semantic/task-oriented communications. The method acts only on source coding and is fully compatible with legacy systems. The semantics are extracted from the image by computing its semantic segmentation map with off-the-shelf software. A newly developed semantic-conditioned adaptive mask module (SAMM) selectively encodes the semantically relevant features of the image. The relevance of the different semantic classes is task-specific and is incorporated in the training phase through appropriate weights in the loss function. At extremely low compression rates, SQ-GAN outperforms state-of-the-art image compression schemes such as JPEG2000, BPG, and deep-learning-based methods across multiple metrics, including perceptual quality and semantic segmentation accuracy on the reconstructed image.
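The idea of weighting semantic classes in the loss can be illustrated with a minimal sketch. This is not the SQ-GAN loss, whose terms the abstract does not specify; it is a plain class-weighted mean squared error, assuming an integer segmentation map and a per-class weight vector. The function name `class_weighted_mse` and all parameter names are hypothetical.

```python
import numpy as np

def class_weighted_mse(original, reconstructed, seg_map, class_weights):
    """Illustrative class-weighted MSE (an assumption, not the paper's loss).

    original, reconstructed: float arrays of shape (H, W, C)
    seg_map: integer array of shape (H, W) holding a class index per pixel
    class_weights: 1-D array, one task-specific weight per semantic class
    """
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    class_weights = np.asarray(class_weights, dtype=np.float64)

    # Per-pixel squared error, averaged over channels: shape (H, W).
    per_pixel_error = np.mean((original - reconstructed) ** 2, axis=-1)

    # Look up the weight of each pixel's class: shape (H, W).
    per_pixel_weight = class_weights[np.asarray(seg_map)]

    # Weighted average, so errors in high-weight classes dominate the loss.
    return float(np.sum(per_pixel_weight * per_pixel_error) / np.sum(per_pixel_weight))


# Toy example: class 1 is treated as three times as relevant as class 0.
original = np.zeros((2, 2, 3))
reconstructed = np.zeros((2, 2, 3))
reconstructed[:, 0, :] = 1.0   # class-0 pixels are off by 1
reconstructed[:, 1, :] = 2.0   # class-1 pixels are off by 2
seg_map = np.array([[0, 1], [0, 1]])
weights = np.array([1.0, 3.0])

loss = class_weighted_mse(original, reconstructed, seg_map, weights)
```

With uniform weights the function reduces to the ordinary mean squared error; raising the weight of a class makes reconstruction errors in that class's pixels count for more, which is the general mechanism by which task-specific relevance can enter training.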