Diffusion models, which generate images by integrating stochastic differential equations, have emerged as a dominant class of generative models. However, the soundness of the diffusion process itself has received limited attention, leaving open the question of whether the problem is well-posed and well-conditioned. In this paper, we explore a perplexing tendency of diffusion models: the network often exhibits infinite Lipschitz constants with respect to the time variable near the zero point. We provide theoretical proofs that these infinite Lipschitz constants exist and empirical results that confirm them. Such Lipschitz singularities threaten stability and accuracy during both the training and inference of diffusion models, so mitigating them holds great potential for improving performance. To address this challenge, we propose a novel approach, dubbed E-TSDM, which alleviates the Lipschitz singularities of the diffusion model near the zero point of timesteps. Our technique yields a substantial improvement in performance. Moreover, as a byproduct, it reduces the Fréchet Inception Distance of acceleration methods that rely on the network's Lipschitz continuity, including DDIM and DPM-Solver, by over 33%. Extensive experiments on diverse datasets validate our theory and method. Our work may advance the understanding of the general diffusion process and provide insights for the design of diffusion models.
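To make the notion of a Lipschitz singularity near the zero point more concrete, the following is a minimal numerical sketch, not the analysis from the paper. It assumes a continuous-time variance-preserving schedule with a linear `beta(t)` (the constants `BETA_MIN` and `BETA_MAX` and all function names are illustrative assumptions, since the abstract does not specify a schedule) and estimates the local Lipschitz constant of the noise scale `sigma(t)` with respect to `t` by finite differences. It illustrates only how a time-dependent quantity can have an unbounded rate of change as `t` approaches zero; it says nothing about any particular trained network.

```python
import math

# Assumed schedule (not given in the text): beta(t) = BETA_MIN + t * (BETA_MAX - BETA_MIN)
# on t in [0, 1], as commonly used for variance-preserving diffusion.
BETA_MIN, BETA_MAX = 0.1, 20.0


def integrated_beta(t):
    """Integral of beta(s) from 0 to t for the assumed linear schedule."""
    return BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t * t


def sigma(t):
    """Noise standard deviation sqrt(1 - exp(-integrated_beta(t)))."""
    # expm1 keeps precision when integrated_beta(t) is tiny.
    return math.sqrt(-math.expm1(-integrated_beta(t)))


def local_lipschitz(f, t, h):
    """Forward finite-difference estimate of |f(t + h) - f(t)| / h."""
    return abs(f(t + h) - f(t)) / h


def estimates(timesteps):
    """Local Lipschitz estimates of sigma at each timestep, using h = 1% of t."""
    return [local_lipschitz(sigma, t, 0.01 * t) for t in timesteps]


if __name__ == "__main__":
    ts = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
    for t, k in zip(ts, estimates(ts)):
        print(f"t = {t:.0e}  estimated local Lipschitz constant = {k:.2f}")
```

Under these assumptions, `sigma(t)` behaves like the square root of `BETA_MIN * t` for small `t`, so the estimate grows without bound as `t` shrinks, which is the kind of behaviour the abstract refers to as a singularity near the zero point.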