Deep learning-based image compression algorithms typically focus on designing encoding and decoding networks and on improving the accuracy of entropy model estimation to enhance rate-distortion (RD) performance. However, few algorithms leverage the compression distortion prior from existing compression algorithms to improve RD performance. In this paper, we propose a latent diffusion model-based remote sensing image compression (LDM-RSIC) method, which aims to enhance the final decoding quality of remote sensing (RS) images by utilizing the distortion prior generated by an LDM. Our approach consists of two stages. In the first stage, an autoencoder learns a prior from the high-quality input image. In the second stage, the prior is generated by an LDM, conditioned on the decoded image of an existing learning-based image compression algorithm, and is used as auxiliary information for generating a texture-rich enhanced image. To better utilize the prior, a channel attention and gate-based dynamic feature attention module (DFAM) is embedded into a Transformer-based multi-scale enhancement network (MEN) for image enhancement. Extensive experiments demonstrate that the proposed LDM-RSIC significantly outperforms existing state-of-the-art traditional and learning-based image compression algorithms in terms of both subjective perception and objective metrics. Additionally, we use the LDM-based scheme to improve the traditional image compression algorithm JPEG2000 and obtain 32.00% bit savings on the DOTA testing set. The code will be available at https://github.com/mlkk518/LDM-RSIC.
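To make the idea of a prior-driven, gated channel attention more concrete, the following is a minimal sketch in plain Python. It is not the DFAM from this work: the function name `channel_attention`, the per-channel `prior` vector, and the choice of adding the prior to a pooled channel mean before a sigmoid gate are all assumptions made for illustration, and the sketch has no learned weights.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, prior):
    """Rescale each channel of `features` by a gate derived from `prior`.

    features: list of C channels, each an H x W list of lists of floats.
    prior:    list of C floats (a hypothetical per-channel prior vector).
    """
    if len(features) != len(prior):
        raise ValueError("need one prior value per channel")

    out = []
    for channel, p in zip(features, prior):
        # Squeeze: global average pooling over the spatial dimensions.
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        # Gate: sigmoid of the pooled statistic plus the prior value.
        gate = sigmoid(mean + p)
        # Excite: scale every pixel in the channel by the same gate.
        out.append([[gate * v for v in row] for row in channel])
    return out
```

In this sketch the prior shifts the gate of each channel, so a channel can be emphasized or suppressed depending on the auxiliary information. A learned module would typically replace the fixed sum with trainable projections.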