Denoising Diffusion Probabilistic Models (DDPMs) have recently gained significant attention. A DDPM defines a Markovian forward process that begins in the data domain and gradually adds noise until it reaches pure white noise. By defining the inverse process and training a deep neural network to learn this mapping, DDPMs generate high-quality samples from complex data distributions. However, these models are inefficient because they require many diffusion steps to produce aesthetically pleasing samples. Additionally, the latent space of diffusion models is less interpretable than that of generative adversarial networks (GANs). In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM). In the forward process, we reduce the dimension of the latent variable through downsampling and then apply the traditional noise perturbation. As a result, the reverse process gradually denoises and upsamples the latent variable to produce a sample from the data distribution. We formalize the Markovian diffusion processes of UDPM and demonstrate its generation capabilities on the popular FFHQ, AFHQv2, and CIFAR10 datasets. UDPM generates images with as few as three network evaluations, whose overall computational cost is less than that of a single DDPM or EDM step, while achieving an FID score of 6.86. This surpasses current state-of-the-art efficient diffusion models that use a single denoising step for sampling. Additionally, UDPM offers an interpretable and interpolable latent space, which gives it an advantage over traditional DDPMs. Our code is available online: \url{https://github.com/shadyabh/UDPM/}
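To make the forward process described in the abstract concrete, the following is a minimal sketch of a "downsample, then perturb with noise" chain. It is an illustration under stated assumptions and not the authors' implementation: the 2x average-pooling operator, the function names (`downsample2x`, `forward_step`, `forward_process`), and the `alphas` and `sigmas` schedule values are all hypothetical choices made for this example.

```python
import numpy as np


def downsample2x(x):
    """Average-pool an (H, W, C) array by a factor of 2.

    Assumption: average pooling stands in for whatever downsampling
    operator the actual model uses.
    """
    h, w, c = x.shape
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def forward_step(x_prev, alpha, sigma, rng):
    """One hypothetical forward step: downsample, scale, add Gaussian noise."""
    x_down = downsample2x(x_prev)
    noise = rng.standard_normal(x_down.shape)
    return alpha * x_down + sigma * noise


def forward_process(x0, alphas, sigmas, seed=0):
    """Run the full forward chain and return every latent, starting with x0."""
    rng = np.random.default_rng(seed)
    latents = [x0]
    for alpha, sigma in zip(alphas, sigmas):
        latents.append(forward_step(latents[-1], alpha, sigma, rng))
    return latents


# Illustrative schedule with three steps (values are made up for the example).
x0 = np.zeros((64, 64, 3))
latents = forward_process(x0, alphas=[0.8, 0.6, 0.4], sigmas=[0.3, 0.5, 0.7])
shapes = [z.shape for z in latents]
```

In this sketch each forward step halves the spatial resolution, so a three-step chain maps a 64x64 image to an 8x8 latent. A reverse process would then need to denoise and upsample once per step, which is consistent with the small number of network evaluations the abstract reports.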