Diffusion probabilistic models have recently achieved remarkable success in generating high-quality image and video data. In this work, we build on this class of generative models and introduce a method for lossy compression of high-resolution images. The resulting codec, which we call DIffusion-based Residual Augmentation Codec (DIRAC), is the first neural codec to allow smooth traversal of the rate-distortion-perception tradeoff at test time, while achieving perceptual quality competitive with GAN-based methods. Furthermore, although sampling from diffusion probabilistic models is notoriously expensive, we show that in the compression setting the number of sampling steps can be drastically reduced.