Region-of-Interest (ROI)-based image compression allocates bits unevenly according to the semantic importance of different regions. Such differentiated coding typically induces a sharp-peaked, heavy-tailed distribution of the latent variables. Describing this distribution accurately requires a probability model with adaptable shape parameters. However, existing methods commonly fit it with a Gaussian model, whose shape is fixed, and this results in a loss of coding performance. To systematically analyze the impact of this distribution on ROI coding, we develop a unified theoretical paradigm for rate-distortion optimization. Building on this paradigm, we propose a novel Generalized Gaussian Model (GGM) to achieve flexible modeling of the latent-variable distribution. To support stable optimization of the GGM, we introduce effective differentiable functions and further propose a dynamic lower bound to alleviate train-test mismatch. Moreover, we introduce finite differences to handle the gradient computation after the GGM fits the distribution. Experiments on COCO2017 demonstrate that our method achieves state-of-the-art performance in both ROI reconstruction and downstream tasks (e.g., segmentation and object detection). Furthermore, compared to classical probability models, our GGM provides a more precise fit to feature distributions and achieves superior coding performance. The project page is at https://github.com/hukai-tju/ROIGGM.
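To make the role of the shape parameter concrete, the following is a minimal sketch that assumes the textbook generalized Gaussian density, f(x) = β / (2αΓ(1/β)) · exp(−(|x − μ|/α)^β), in which β = 2 recovers a Gaussian and smaller β gives a sharper peak and heavier tails. The parametrization (`mu`, `alpha`, `beta`), the unit-width quantization bin, the Simpson integration, and every function name here are illustrative assumptions; none of them is taken from the paper or its repository, and the central difference shown is only one simple way finite differences could stand in for an analytic gradient with respect to the shape parameter.

```python
import math


def ggm_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Textbook generalized Gaussian density (assumed form, not the paper's code)."""
    z = abs(x - mu) / alpha
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-(z ** beta))


def bin_probability(y, mu=0.0, alpha=1.0, beta=2.0, steps=200):
    """Probability mass of the unit bin [y - 0.5, y + 0.5].

    Uses composite Simpson's rule; `steps` must be even.
    """
    a = y - 0.5
    h = 1.0 / steps
    total = ggm_pdf(a, mu, alpha, beta) + ggm_pdf(a + 1.0, mu, alpha, beta)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * ggm_pdf(a + i * h, mu, alpha, beta)
    return total * h / 3.0


def bits(y, mu, alpha, beta):
    """Estimated code length (in bits) of the quantized symbol y."""
    return -math.log2(bin_probability(y, mu, alpha, beta))


def dbits_dbeta(y, mu, alpha, beta, eps=1e-4):
    """Central finite difference of the code length w.r.t. the shape beta."""
    upper = bits(y, mu, alpha, beta + eps)
    lower = bits(y, mu, alpha, beta - eps)
    return (upper - lower) / (2.0 * eps)


if __name__ == "__main__":
    # Compare a Gaussian-shaped model (beta = 2) with a sharper-peaked,
    # heavier-tailed one (beta = 0.8) on a few quantized symbol values.
    for y in (0, 1, 4):
        print(y, bits(y, 0.0, 1.0, 2.0), bits(y, 0.0, 1.0, 0.8),
              dbits_dbeta(y, 0.0, 1.0, 0.8))
```

With `beta = 2` and `alpha = sqrt(2) * sigma`, `bin_probability` reduces to the usual discretized Gaussian likelihood, which gives a convenient sanity check for the sketch.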