Diffusion models (DMs) have been adopted across diverse fields owing to their remarkable ability to capture intricate data distributions. In this paper, we propose a Fast Diffusion Model (FDM) that significantly speeds up both the training and the sampling of DMs from a stochastic optimization perspective. We first observe that the diffusion process of DMs corresponds to the stochastic optimization process of stochastic gradient descent (SGD) on a stochastic time-variant problem. Then, inspired by momentum SGD, which uses both the gradient and an extra momentum term to achieve faster and more stable convergence than SGD, we integrate momentum into the diffusion process of DMs. This raises a unique challenge: deriving the noise perturbation kernel from the momentum-based diffusion process. To this end, we frame the process as a damped oscillation system whose critically damped state (the kernel solution) avoids oscillation and yields faster convergence of the diffusion process. Empirical results show that FDM can be applied to several popular DM frameworks, e.g., VP, VE, and EDM, and reduces their training cost by about 50% while achieving comparable image synthesis performance on the CIFAR-10, FFHQ, and AFHQv2 datasets. Moreover, under the same samplers, FDM reduces the number of sampling steps by about 3x while achieving similar performance. The code is available at https://github.com/sail-sg/FDM.
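To illustrate why the critically damped state is the attractive one, the following minimal sketch evaluates the closed-form solution of a generic textbook damped oscillator, x'' + 2ζωx' + ω²x = 0 with x(0) = x0 and x'(0) = 0. This is not the FDM perturbation kernel: the function name `damped_position` and the symbols `zeta` (damping ratio) and `omega` (natural frequency) are assumptions introduced here purely for illustration.

```python
import math


def damped_position(t, zeta, omega=1.0, x0=1.0):
    """Closed-form x(t) for x'' + 2*zeta*omega*x' + omega**2*x = 0,
    with x(0) = x0 and x'(0) = 0 (generic oscillator, not the FDM kernel)."""
    if zeta < 1.0:
        # Underdamped: decaying oscillation that overshoots zero.
        wd = omega * math.sqrt(1.0 - zeta ** 2)
        envelope = math.exp(-zeta * omega * t)
        return x0 * envelope * (
            math.cos(wd * t) + (zeta * omega / wd) * math.sin(wd * t)
        )
    if zeta == 1.0:
        # Critically damped: repeated root -omega, no oscillation.
        return x0 * (1.0 + omega * t) * math.exp(-omega * t)
    # Overdamped: two distinct real roots; the slow root r1 dominates.
    s = omega * math.sqrt(zeta ** 2 - 1.0)
    r1 = -zeta * omega + s
    r2 = -zeta * omega - s
    a = -x0 * r2 / (r1 - r2)  # coefficients chosen so x(0)=x0, x'(0)=0
    b = x0 * r1 / (r1 - r2)
    return a * math.exp(r1 * t) + b * math.exp(r2 * t)


if __name__ == "__main__":
    # Compare the three regimes at a few time points.
    for t in (0.0, 2.0, 5.0, 10.0):
        row = [damped_position(t, z) for z in (0.2, 1.0, 2.0)]
        print(t, row)
```

In this toy setting the underdamped trajectory crosses zero and oscillates, the overdamped one decays slowly, and the critically damped one approaches zero without changing sign, which mirrors the qualitative argument in the abstract.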