Our goal is to extend the denoising diffusion implicit model (DDIM) to general diffusion models~(DMs) beyond isotropic diffusions. Instead of constructing a non-Markov noising process as in the original DDIM, we examine the mechanism of DDIM from a numerical perspective. We discover that DDIM can be obtained by using specific approximations of the score when solving the corresponding stochastic differential equation. We present an interpretation of the accelerating effects of DDIM that also explains the advantages of a deterministic sampling scheme over a stochastic one for fast sampling. Building on this insight, we extend DDIM to general DMs with a small but delicate modification in the parameterization of the score network, and call the result generalized DDIM (gDDIM). We validate gDDIM on two non-isotropic DMs: the blurring diffusion model (BDM) and the critically-damped Langevin diffusion model (CLD). We observe more than 20-fold acceleration in BDM. In CLD, a diffusion model that augments the diffusion process with velocity, our algorithm achieves an FID score of 2.26 on CIFAR10 with only 50 score function evaluations~(NFEs), and an FID score of 2.86 with only 27 NFEs. Code is available at https://github.com/qsh-zh/gDDIM
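As a rough illustration of the numerical view described above, the following sketch shows the standard deterministic DDIM update in the isotropic (variance-preserving) case only. It is not taken from the gDDIM repository. The function names `ddim_step` and `sample_point_mass`, the coarse schedule `abars`, and the point-mass data distribution are all assumptions chosen for illustration. The noise estimate is held fixed across each step, which is one way to read "specific approximations of the score". For data concentrated at a single point the exact noise estimate is constant along the trajectory, so even a very coarse schedule recovers the data point.

```python
import numpy as np

def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM update for an isotropic (VP) diffusion.

    abar_t and abar_prev are cumulative signal levels (alpha-bar) at the
    current and next (less noisy) time; eps_pred is the noise estimate.
    """
    # Clean sample implied by the current noise estimate.
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
    # Move to the lower noise level, reusing the same noise estimate.
    return np.sqrt(abar_prev) * x0_pred + np.sqrt(1.0 - abar_prev) * eps_pred

def sample_point_mass(mu, abars, rng):
    """Run DDIM with the exact noise predictor for data concentrated at mu.

    This toy setting is an assumption for illustration, not a trained model.
    """
    # Draw the starting point from the marginal at the noisiest level.
    noise = rng.standard_normal(mu.shape)
    x = np.sqrt(abars[0]) * mu + np.sqrt(1.0 - abars[0]) * noise
    for abar_t, abar_prev in zip(abars[:-1], abars[1:]):
        # Exact noise estimate when every data sample equals mu.
        eps = (x - np.sqrt(abar_t) * mu) / np.sqrt(1.0 - abar_t)
        x = ddim_step(x, eps, abar_t, abar_prev)
    return x

mu = np.array([1.0, -2.0, 0.5])
abars = np.array([0.01, 0.3, 0.7, 1.0])  # hypothetical 3-step schedule, ending noise-free
x0 = sample_point_mass(mu, abars, np.random.default_rng(0))
```

With only three steps, `x0` matches `mu` up to floating-point error in this toy setting. A trained score network on real data would not be exact, and the number of steps then trades off against sample quality.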