Despite their success in real data synthesis, diffusion models (DMs) often suffer from slow and costly training and sampling, which limits their broader application. To mitigate this, we propose the Fast Diffusion Model (FDM), which improves the diffusion process of DMs from a stochastic optimization perspective to speed up both training and sampling. Specifically, we first find that the diffusion process of DMs accords with the stochastic optimization process of stochastic gradient descent (SGD) on a stochastic time-variant problem. Because momentum SGD uses both the current gradient and an extra momentum term, it achieves more stable and faster convergence. Inspired by this, we introduce momentum into the diffusion process to accelerate both training and sampling. This, however, raises the challenge of deriving the noise perturbation kernel from the momentum-based diffusion process. To this end, we frame the momentum-based process as a damped oscillation system whose critically damped state (the kernel solution) avoids oscillation and thus lets the diffusion process converge faster. Empirical results show that FDM can be applied to several popular DM frameworks, e.g., VP, VE, and EDM, and reduces their training cost by about 50% with comparable image synthesis performance on the CIFAR-10, FFHQ, and AFHQv2 datasets. Moreover, FDM reduces their sampling steps by about $3\times$ while achieving similar performance under the same deterministic samplers. The code is available at https://github.com/sail-sg/FDM.
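To illustrate why the critically damped state is the attractive one, the sketch below evaluates the textbook damped oscillator $x'' + 2\beta x' + \omega^2 x = 0$ with $x(0)=1$, $x'(0)=0$ in closed form. It is a generic physics illustration, not the FDM perturbation kernel: the function name `damped_response`, the symbols `omega` and `beta`, and the chosen parameter values are all assumptions introduced here for illustration.

```python
import math


def damped_response(t, omega=1.0, beta=1.0):
    """Closed-form x(t) for x'' + 2*beta*x' + omega**2 * x = 0, x(0)=1, x'(0)=0.

    Hypothetical helper for illustration; not part of the FDM code base.
    """
    if math.isclose(beta, omega):
        # Critically damped: repeated root at -omega, no oscillation.
        return math.exp(-omega * t) * (1.0 + omega * t)
    if beta < omega:
        # Underdamped: decaying oscillation at frequency omega_d.
        omega_d = math.sqrt(omega**2 - beta**2)
        return math.exp(-beta * t) * (
            math.cos(omega_d * t) + (beta / omega_d) * math.sin(omega_d * t)
        )
    # Overdamped: two real roots; the slow root dominates at large t.
    root = math.sqrt(beta**2 - omega**2)
    r_fast, r_slow = -beta - root, -beta + root
    return (r_slow * math.exp(r_fast * t) - r_fast * math.exp(r_slow * t)) / (
        r_slow - r_fast
    )


# Compare the three regimes on t in [0, 20] (assumed illustrative values).
times = [0.1 * k for k in range(201)]
for name, beta in [("under", 0.2), ("critical", 1.0), ("over", 2.0)]:
    xs = [damped_response(t, beta=beta) for t in times]
    # Count sign changes as a simple proxy for oscillation.
    crossings = sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)
    print(f"{name:9s} zero crossings={crossings} |x(20)|={abs(xs[-1]):.2e}")
```

In this toy setting the critically damped trajectory never crosses zero, while the underdamped one oscillates and the overdamped one decays more slowly, which is the qualitative behaviour the abstract appeals to.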