We propose a general framework for optimizing noise schedules in diffusion models, applicable to both training and sampling. Our method enforces a constant rate of change in the probability distribution of the diffused data throughout the diffusion process, where the rate of change is quantified by a user-defined discrepancy measure. We introduce three such measures, which can be flexibly selected or combined depending on the domain and model architecture. While our framework is inspired by theoretical insights, we do not aim to provide a complete theoretical justification of how distributional change affects sample quality. Instead, we focus on establishing a general-purpose scheduling framework and validating its empirical effectiveness. Through extensive experiments, we demonstrate that our approach consistently improves the performance of both pixel-space and latent-space diffusion models across various datasets, samplers, and numbers of function evaluations ranging from 5 to 250. In particular, when applied to both training and sampling schedules, our method achieves a state-of-the-art FID score of 2.03 on LSUN Horse 256$\times$256, without compromising mode coverage.
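The constant-rate idea described above can be sketched as follows: given per-interval discrepancy values on a fine reference grid of noise levels, place sampling timesteps so that the cumulative discrepancy is equally spaced between consecutive steps. This is a minimal illustration, not the paper's implementation; the function name `constant_rate_schedule` and the toy discrepancy curve are assumptions introduced here, and the actual discrepancy measure would be one of the three proposed in the paper.

```python
import numpy as np

def constant_rate_schedule(discrepancy, num_steps):
    """Choose num_steps intervals over a fine grid of T+1 noise levels
    so the cumulative user-defined discrepancy is equally spaced.

    discrepancy: length-T array of per-interval discrepancies D(t_i, t_{i+1}).
    Returns num_steps+1 indices into the fine grid (0 .. T), inclusive.
    """
    # Cumulative distributional change along the diffusion process.
    cum = np.concatenate([[0.0], np.cumsum(discrepancy)])
    # Equal increments of cumulative discrepancy between selected steps.
    targets = np.linspace(0.0, cum[-1], num_steps + 1)
    # Invert the (monotone) cumulative curve to find grid positions.
    positions = np.interp(targets, cum, np.arange(len(cum)))
    return np.round(positions).astype(int)

# Toy example: a hypothetical discrepancy concentrated early in the
# process yields denser steps there and sparser steps later.
fine_disc = np.exp(-np.linspace(0, 5, 100))
steps = constant_rate_schedule(fine_disc, 10)
```

The key design choice is that the schedule adapts to wherever the chosen measure says the distribution changes fastest, rather than assuming a fixed parametric form.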