Diffusion-based extreme image compression methods have achieved impressive performance at extremely low bitrates. However, constrained by an iterative denoising process that starts from pure noise, these methods are limited in both fidelity and efficiency. To address these two issues, we present Relay Residual Diffusion Extreme Image Compression (RDEIC), which leverages compressed feature initialization and residual diffusion. Specifically, we first use the compressed latent features of the image with added noise, instead of pure noise, as the starting point, which eliminates the unnecessary initial stages of the denoising process. Second, we design a novel relay residual diffusion that reconstructs the raw image by iteratively removing both the added noise and the residual between the compressed and target latent features. Notably, our relay residual diffusion network seamlessly integrates pre-trained Stable Diffusion to leverage its robust generative capability for high-quality reconstruction. Third, we propose a fixed-step fine-tuning strategy to eliminate the discrepancy between the training and inference phases, further improving reconstruction quality. Extensive experiments demonstrate that the proposed RDEIC achieves state-of-the-art visual quality and outperforms existing diffusion-based extreme image compression methods in both fidelity and efficiency. The source code will be available at https://github.com/huai-chang/RDEIC.
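To make the first two ideas concrete, the following is a minimal toy sketch, not the RDEIC implementation. It assumes a simple linear schedule in which the state at step n is z_n = z_0 + (n/N)(z_c - z_0) + (n/N)·sigma·eps, so that the last step equals the noised compressed latent and step 0 equals the target latent. The function names `noisy_start` and `relay_residual_sample`, the schedule, and the stand-in denoiser `predict_z0` are all hypothetical; in the actual method the denoiser is built on pre-trained Stable Diffusion, and latents are tensors rather than flat lists.

```python
import random


def noisy_start(z_c, sigma, rng):
    """Starting point: the compressed latent plus Gaussian noise, not pure noise."""
    return [c + sigma * rng.gauss(0.0, 1.0) for c in z_c]


def relay_residual_sample(z_start, predict_z0, num_steps):
    """Deterministic reverse process under the assumed linear schedule.

    With eta_n = n / num_steps, both the added noise and the residual
    (z_c - z_0) are scaled by eta_n, so each step shrinks the gap between
    the current state and the predicted target by eta_{n-1} / eta_n.
    """
    z = list(z_start)
    for n in range(num_steps, 0, -1):
        shrink = (n - 1) / n  # equals eta_{n-1} / eta_n
        z0_hat = predict_z0(z, n)  # hypothetical denoiser: estimate of the target latent
        z = [t + shrink * (v - t) for v, t in zip(z, z0_hat)]
    return z


# Toy demonstration with an oracle denoiser that returns the true target latent.
rng = random.Random(0)
z0 = [rng.gauss(0.0, 1.0) for _ in range(16)]        # target latent
z_c = [t + 0.3 * rng.gauss(0.0, 1.0) for t in z0]    # compressed latent = target + residual
z_start = noisy_start(z_c, sigma=0.5, rng=rng)
z_rec = relay_residual_sample(z_start, lambda z, n: z0, num_steps=5)
```

With the oracle denoiser, the final step sets the state to the predicted target, so `z_rec` matches `z0`; with a learned denoiser the quality of the result depends on how well the target is estimated at each of the few remaining steps.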