The Principles of Diffusion Models

This book presents the core principles that have guided the development of diffusion models, tracing their origins and showing how diverse formulations arise from shared mathematical ideas. Diffusion modeling starts by defining a forward process that gradually corrupts data into noise, linking the data distribution to a simple prior through a continuum of intermediate distributions. The goal is to learn a reverse process that transforms noise back into data while recovering the same intermediates. We describe three complementary views. The variational view, inspired by variational autoencoders, sees diffusion as learning to remove noise step by step. The score-based view, rooted in energy-based modeling, learns the gradient of the evolving data distribution, indicating how to nudge samples toward more likely regions. The flow-based view, related to normalizing flows, treats generation as following a smooth path that moves samples from noise to data under a learned velocity field. These perspectives share a common backbone: a time-dependent velocity field whose flow transports a simple prior to the data. Sampling then amounts to solving a differential equation that evolves noise into data along a continuous trajectory. On this foundation, the book discusses guidance for controllable generation, efficient numerical solvers, and diffusion-motivated flow-map models that learn direct mappings between arbitrary times. It provides a conceptual and mathematically grounded understanding of diffusion models for readers with basic deep-learning knowledge.
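The forward-corruption and ODE-sampling picture described above can be made concrete with a toy problem. The following is a minimal sketch, not the book's method: it assumes a 1-D Gaussian data distribution and a linear interpolation path x_t = (1 - t) x_0 + t eps, both chosen here only because the marginal velocity field then has a closed form. The names `MU`, `S`, `forward_sample`, `velocity`, and `sample` are hypothetical. In an actual diffusion model the velocity (or score) would be a trained neural network rather than an analytic formula.

```python
import numpy as np

# Assumed toy data distribution N(MU, S^2); not taken from the book.
MU, S = 2.0, 0.5


def forward_sample(x0, t, rng):
    """Forward process: corrupt data toward noise along a linear path.

    x_t = (1 - t) * x0 + t * eps, so t=0 is data and t=1 is pure noise.
    """
    eps = rng.standard_normal(x0.shape)
    return (1.0 - t) * x0 + t * eps


def velocity(x, t):
    """Closed-form marginal velocity for the Gaussian toy problem.

    The marginal at time t is N(mean_t, var_t); the field below transports
    these marginals into one another. A learned network would replace this.
    """
    mean_t = (1.0 - t) * MU
    var_t = (1.0 - t) ** 2 * S**2 + t**2
    return -MU + (t - (1.0 - t) * S**2) / var_t * (x - mean_t)


def sample(n, steps=200, seed=0):
    """Generate samples by integrating dx/dt = velocity(x, t) from t=1 to t=0."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)  # draw from the simple prior at t=1
    dt = 1.0 / steps
    for k in range(steps):
        t = 1.0 - k * dt
        x = x - dt * velocity(x, t)  # explicit Euler step backward in time
    return x


if __name__ == "__main__":
    samples = sample(20000)
    print("sample mean:", samples.mean())
    print("sample std: ", samples.std())
```

Explicit Euler is used only for brevity; the efficient numerical solvers the book discusses would replace that loop with a higher-order integrator.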



New book: "The Principles of Diffusion Models", 470-page PDF
