We present a systematic theoretical framework for analyzing the rate-distortion (R-D) limits of learned image compression. While recent neural codecs achieve remarkable empirical results, how far they remain from the information-theoretic limit is unclear. We address this gap by decomposing the R-D performance loss into three components: variance estimation, quantization strategy, and context modeling. First, we derive the optimal latent variance as the second moment under a Gaussian assumption, providing a principled alternative to hyperprior-based estimation. Second, we quantify the gap between uniform quantization and the Gaussian test channel given by the reverse water-filling theorem. Third, we extend the framework to context modeling and show that accurate mean prediction yields substantial entropy reduction. Unlike prior R-D estimators, our method offers a structurally interpretable perspective that aligns with real compression modules and enables fine-grained analysis. Through joint simulation and end-to-end training, we derive a tight, actionable approximation of the theoretical R-D limit, offering new insights into the design of more efficient learned compression systems.
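The quantization gap mentioned above can be illustrated concretely for a scalar Gaussian source. A minimal sketch (not the paper's method): the Gaussian R-D function is R(D) = ½ log₂(σ²/D), while a high-rate entropy-coded uniform quantizer with step Δ achieves distortion ≈ Δ²/12 at rate ≈ h(X) − log₂ Δ. Comparing the two recovers the classical constant gap of ½ log₂(2πe/12) ≈ 0.255 bits.

```python
import math

def gaussian_rd_rate(var, dist):
    """Gaussian R-D function: R(D) = 0.5*log2(var/D) bits for D <= var, else 0."""
    return 0.5 * math.log2(var / dist) if dist < var else 0.0

def uniform_quantizer_rate(var, dist):
    """High-rate entropy-coded uniform quantizer on a Gaussian source.

    Distortion ~ Delta^2/12, so Delta = sqrt(12*D); the rate is the
    differential entropy h(X) = 0.5*log2(2*pi*e*var) minus log2(Delta).
    """
    delta = math.sqrt(12.0 * dist)
    return 0.5 * math.log2(2.0 * math.pi * math.e * var) - math.log2(delta)

if __name__ == "__main__":
    var = 1.0
    for dist in (0.25, 0.1, 0.01):
        r_opt = gaussian_rd_rate(var, dist)
        r_unif = uniform_quantizer_rate(var, dist)
        # The gap is constant: 0.5*log2(2*pi*e/12) ~= 0.255 bits/sample.
        print(f"D={dist}: R(D)={r_opt:.3f} b, uniform~{r_unif:.3f} b, "
              f"gap={r_unif - r_opt:.3f} b")
```

The constant ~0.255-bit (about 1.53 dB) penalty of scalar uniform quantization relative to the Gaussian test channel is one concrete instance of the quantization-strategy loss component in the decomposition above.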