Denoising diffusion probabilistic models (DDPMs) are a relatively recent class of generative models that has yet to be fully explored. They use Langevin dynamics, expressed as stochastic differential equations, to describe two processes: a forward process that transforms data into noise, and a reverse process that transforms noise into generated data. Many of these methods augment the data, treated as a ``position'' variable, with auxiliary variables referred to as ``velocity'', ``acceleration'', and so on. Under this formulation, the dynamics can be ``critically damped''. Critical damping has been successfully introduced in Critically-Damped Langevin Dynamics (CLD) and Critically-Damped Third-Order Langevin Dynamics (TOLD++), but has not yet been applied to dynamics of arbitrary order. The proposed methodology generalizes Higher-Order Langevin Dynamics (HOLD), a recent state-of-the-art diffusion method, by introducing the concept of critical damping from systems analysis. As in TOLD++, this work proposes an optimal set of hyperparameters in the $n$-dimensional case, whereas HOLD leaves these to be user-defined. Additionally, this work provides closed-form solutions for the mean and covariance of the forward process that greatly simplify its implementation. Experiments are performed on the CIFAR-10 and CelebA-HQ $256 \times 256$ datasets and evaluated using the Fréchet Inception Distance (FID).
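
To make the systems-analysis notion of critical damping concrete, the following is a minimal sketch for a textbook second-order system $\ddot{x} + \gamma \dot{x} + \omega^2 x = 0$. It is not the paper's stochastic dynamics or its proposed hyperparameters. The function names and the symbols `gamma` and `omega` are assumptions introduced purely for illustration. The system is critically damped when its characteristic polynomial $\lambda^2 + \gamma \lambda + \omega^2$ has a repeated real root, that is, when $\gamma^2 = 4\omega^2$.

```python
import cmath


def eigenvalues(gamma, omega):
    """Roots of the characteristic polynomial lambda^2 + gamma*lambda + omega^2.

    These are the eigenvalues of the first-order form of the toy system,
    with drift matrix [[0, 1], [-omega^2, -gamma]] (illustrative only).
    """
    disc = gamma * gamma - 4.0 * omega * omega
    root = cmath.sqrt(disc)  # complex sqrt so the underdamped case works too
    return (-gamma + root) / 2.0, (-gamma - root) / 2.0


def damping_regime(gamma, omega, tol=1e-12):
    """Classify the toy system by the sign of the discriminant."""
    disc = gamma * gamma - 4.0 * omega * omega
    if abs(disc) <= tol:
        return "critical"
    return "overdamped" if disc > 0 else "underdamped"


def critical_gamma(omega):
    """Friction value that makes the toy system critically damped."""
    return 2.0 * omega


if __name__ == "__main__":
    omega = 1.0
    for gamma in (1.0, critical_gamma(omega), 3.0):
        print(gamma, damping_regime(gamma, omega), eigenvalues(gamma, omega))
```

In the critically damped case both eigenvalues coincide at $-\gamma/2$, which gives the fastest decay without oscillation. The higher-order setting described above involves more coupled variables and is not captured by this two-variable sketch.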