Popularized by their strong image generation performance, diffusion and related methods for generative modeling have found widespread success in visual media applications. In particular, diffusion methods have enabled new approaches to data compression, where realistic reconstructions can be generated at extremely low bit-rates. This article provides a unifying review of recent diffusion-based methods for generative lossy compression, with a focus on image compression. These methods generally encode the source into an embedding and employ a diffusion model to iteratively refine it in the decoding procedure, such that the final reconstruction approximately follows the ground truth data distribution. The embedding can take various forms and is typically transmitted via an auxiliary entropy model, and recent methods also explore the use of diffusion models themselves for information transmission via channel simulation. We review representative approaches through the lens of rate-distortion-perception theory, highlighting the role of common randomness and connections to inverse problems, and identify open challenges.