The rate-distortion-perception (RDP) tradeoff characterizes the fundamental limits of lossy compression by jointly considering bitrate, reconstruction fidelity, and perceptual quality. Although recent neural compression methods have improved perceptual performance, they typically operate at fixed points on the RDP surface and must be retrained to target a different tradeoff. In this work, we propose a training-free framework that leverages pre-trained diffusion models to traverse the entire RDP surface. Our approach integrates a reverse channel coding (RCC) module with a novel score-scaled probability flow ODE decoder. We prove that the proposed diffusion decoder is optimal for the distortion-perception tradeoff under additive white Gaussian noise (AWGN) observations, and that the overall framework with the RCC module achieves the optimal RDP function in the Gaussian case. Empirical results across multiple datasets demonstrate the framework's flexibility and effectiveness in navigating the three-way RDP tradeoff using pre-trained diffusion models. Our results establish a practical and theoretically grounded approach to adaptive, perception-aware compression.
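To make the idea of a score-scaled probability flow ODE decoder concrete, the following is a minimal sketch for a scalar Gaussian source observed through AWGN, where the score is known in closed form and no trained network is needed. It is an illustration under stated assumptions, not the paper's implementation: the variance-exploding form of the ODE, the placement of the scale factor `rho`, and the names `scaled_pf_ode_decode`, `source_var`, and `sigma_obs` are all hypothetical, and the RCC module is not modeled (the noisy observation is simulated directly).

```python
import numpy as np


def scaled_pf_ode_decode(y, sigma_obs, source_var=1.0, rho=1.0, n_steps=2000):
    """Decode a noisy observation y = x + sigma_obs * n with a score-scaled ODE.

    Assumed (hypothetical) form: dx/dsigma = -rho * sigma * score(x, sigma),
    integrated from sigma = sigma_obs down to sigma = 0.
    For a N(0, source_var) source the noisy marginal is N(0, source_var + sigma^2),
    so score(x, sigma) = -x / (source_var + sigma^2).
    rho = 1 is the ordinary probability flow ODE.
    """
    x = np.asarray(y, dtype=float).copy()
    sigmas = np.linspace(sigma_obs, 0.0, n_steps + 1)
    for s_hi, s_lo in zip(sigmas[:-1], sigmas[1:]):
        s_mid = 0.5 * (s_hi + s_lo)               # noise level used for this step
        score = -x / (source_var + s_mid ** 2)    # analytic Gaussian score
        drift = -rho * s_mid * score              # scaled probability flow drift
        x = x + (s_lo - s_hi) * drift             # Euler step toward sigma = 0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source_var, sigma_obs = 1.0, 1.0
    x = rng.normal(0.0, np.sqrt(source_var), size=100_000)
    y = x + sigma_obs * rng.normal(size=x.shape)  # AWGN observation

    for rho in (1.0, 1.5, 2.0):
        x_hat = scaled_pf_ode_decode(y, sigma_obs, source_var, rho=rho)
        mse = np.mean((x - x_hat) ** 2)           # distortion
        var = np.var(x_hat)                       # proxy for perceptual match
        print(f"rho={rho:.1f}  MSE={mse:.3f}  output variance={var:.3f}")
```

In this Gaussian toy setting the ODE has the closed-form solution `x_hat = y * (source_var / (source_var + sigma_obs**2)) ** (rho / 2)`. With `rho = 1` the output variance equals the source variance, so the reconstruction distribution matches the source; with `rho = 2` the decoder reduces to the posterior mean, which minimizes mean squared error. Intermediate values of `rho` trade one for the other without any retraining, which is the kind of behavior the abstract describes.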