Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption

Although recent generative image compression methods have shown impressive potential in optimizing the rate-distortion-perception trade-off, they still struggle to adapt their bitrate flexibly to diverse compression requirements and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, Control-GIC, the first capable of fine-grained bitrate adaption across a broad spectrum while ensuring high fidelity and generality. We base Control-GIC on a VQGAN framework that represents an image as a variable-length sequence of codes (i.e., VQ-indices), which can be losslessly compressed and whose number correlates directly and positively with the bitrate. Drawing inspiration from the classical coding principle, we therefore correlate the information density of local image patches with their granular representations, so that the number of codes adjusts dynamically with the granularity decisions. This means we can flexibly choose a granularity allocation for the patches to reach a desired compression rate. We further develop a probabilistic conditional decoder that traces back to the previously encoded multi-granularity representations according to the transmitted codes and then reconstructs hierarchical granular features, formalized as conditional probabilities, enabling more informative aggregation that improves reconstruction realism. Our experiments show that Control-GIC allows highly flexible and controllable bitrate adaption and can even compress an entire dataset in a single pass to satisfy a constrained bitrate condition. The results also demonstrate its superior performance over recent state-of-the-art methods.
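The link between patch-level information density, granularity, and code count can be illustrated with a small sketch. This is not the authors' implementation: it assumes grey-level histogram entropy as the density measure, three granularity levels that emit 1, 4, and 16 codes per patch, and a hypothetical codebook of 1024 entries. The names `allocate_granularity`, `code_count`, and `bits_per_pixel` are invented for illustration.

```python
import numpy as np


def patch_entropy(patch, bins=32):
    """Shannon entropy (bits) of a patch's grey-level histogram; values assumed in [0, 1]."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def allocate_granularity(image, patch=16, ratios=(0.5, 0.3, 0.2)):
    """Assign each patch a level: 0 = coarse, 1 = medium, 2 = fine.

    `ratios` is an assumed control knob: the fraction of patches placed at
    each level, filled from the lowest-entropy patches upward.
    """
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    ent = np.array([
        [patch_entropy(image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch])
         for c in range(cols)]
        for r in range(rows)
    ])
    order = np.argsort(ent, axis=None, kind="stable")  # low entropy first
    n = order.size
    n_coarse = int(round(ratios[0] * n))
    n_medium = int(round(ratios[1] * n))
    levels = np.full(n, 2, dtype=int)  # whatever remains is coded finely
    levels[order[:n_coarse]] = 0
    levels[order[n_coarse:n_coarse + n_medium]] = 1
    return levels.reshape(rows, cols)


def code_count(levels):
    """Total VQ indices, assuming 1, 4, or 16 codes per patch by level."""
    return int((4 ** levels).sum())


def bits_per_pixel(levels, num_pixels, codebook_size=1024):
    """Bitrate before lossless coding of the indices: codes * log2(K) / pixels."""
    return code_count(levels) * float(np.log2(codebook_size)) / num_pixels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    for ratios in [(1.0, 0.0, 0.0), (0.5, 0.25, 0.25), (0.0, 0.0, 1.0)]:
        lv = allocate_granularity(img, patch=16, ratios=ratios)
        print(ratios, code_count(lv), bits_per_pixel(lv, img.size))
```

Under these assumptions, shifting the ratios toward the fine level increases the code count monotonically, which is the sense in which a single model could cover a range of bitrates by changing only the granularity allocation.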
