Recent advances in diffusion-based generative priors have enabled visually plausible image compression at extremely low bit rates. However, existing approaches suffer from slow sampling and suboptimal bit allocation due to fragmented training paradigms. In this work, we propose Accelerating \textbf{Diff}usion-based Image Compression via \textbf{C}onsistency Prior \textbf{R}efinement (DiffCR), a novel compression framework for efficient, high-fidelity image reconstruction. At the heart of DiffCR is a Frequency-aware Skip Estimation (FaSE) module that refines the $\epsilon$-prediction prior of a pre-trained latent diffusion model and aligns it with the compressed latents at each timestep via Frequency Decoupling Attention (FDA). In addition, a lightweight consistency estimator enables fast \textbf{two-step decoding} by preserving the semantic trajectory of diffusion sampling. Without updating the backbone diffusion model, DiffCR achieves substantial bitrate savings (27.2\% BD-rate in LPIPS and 65.1\% BD-rate in PSNR) and over a $10\times$ decoding speed-up compared with state-of-the-art diffusion-based compression baselines.