An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. For a simplified model in which the source distribution is multivariate Gaussian with a given covariance matrix, the explicit closed-form evolution trajectory of the distributions across reverse sampling steps is derived, and consequently the Kullback-Leibler (KL) divergence between the source distribution and the reverse sampling output is obtained. The effect of the number of time discretization steps on the convergence of this KL divergence is studied via the Euler-Maclaurin expansion. An optimization problem is then formulated, and the optimal noise schedule is derived via calculus of variations; it is shown to follow a tangent law whose coefficient is determined by the eigenvalues of the source covariance matrix. For an alternative, more practically realistic scenario in which pretrained models are available for given noise schedules, the KL divergence also provides a measure for comparing different time discretization strategies in reverse sampling. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies, particularly when the budget on the number of function evaluations is very tight.
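The KL divergence that the abstract uses as its central metric has a closed form between Gaussians. As a minimal illustrative sketch (not the paper's derivation), the scalar case below computes KL(N(mu1, var1) || N(mu2, var2)); in the paper's multivariate Gaussian setting with a diagonalizable covariance, the total divergence decomposes into a sum of such terms over the eigen-directions of the covariance matrix.

```python
import math

def kl_gaussian(mu1, var1, mu2, var2):
    """Closed-form KL( N(mu1, var1) || N(mu2, var2) ) for 1-D Gaussians.

    Illustrative only: the paper works with the full multivariate case,
    where the divergence is a sum of terms of this shape over the
    eigenvalues of the source covariance matrix.
    """
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

# Identical Gaussians have zero divergence; a unit mean shift at unit
# variance costs 1/2 nat.
print(kl_gaussian(0.0, 1.0, 0.0, 1.0))  # → 0.0
print(kl_gaussian(0.0, 1.0, 1.0, 1.0))  # → 0.5
```

In the sampler-comparison scenario described above, a quantity of this form, evaluated between the Gaussian source and the distribution produced by a finite number of reverse steps, is what makes different time discretization strategies directly comparable.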