Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative models that has been successfully applied to a diverse range of problems, including image and video generation, protein and material synthesis, weather forecasting, and neural surrogates of partial differential equations. Despite their ubiquity, it is hard to find an introduction to DDPMs that is simple, comprehensive, clean and clear. The compact explanations necessary in research papers cannot elucidate all of the design steps taken to formulate the DDPM, and the rationale for the steps that are presented is often omitted to save space. Moreover, expositions are typically presented from the variational lower bound perspective, which is unnecessary and arguably harmful, as it obfuscates why the method works and suggests generalisations that do not perform well in practice. On the other hand, perspectives that take the continuous-time limit are beautiful and general, but they have a high barrier to entry, as they require background knowledge of stochastic differential equations and probability flow. In this note, we distill the formulation of the DDPM into six simple steps, each of which comes with a clear rationale. We assume that the reader is familiar with fundamental topics in machine learning, including basic probabilistic modelling, Gaussian distributions, maximum likelihood estimation, and deep learning.