Imperfect score matching leads to a shift between the training and sampling distributions of diffusion models. Due to the recursive nature of the generation process, errors in earlier steps yield sampling iterates that drift away from the training distribution. Yet the standard training objective, Denoising Score Matching (DSM), is only designed to optimize over non-drifted data. To train on drifted data, we propose to enforce a \emph{consistency} property, which states that the predictions of the model on its own generated data are consistent across time. Theoretically, we show that if the score is learned perfectly on some non-drifted points (via DSM) and the consistency property is enforced everywhere, then the score is learned accurately everywhere. Empirically, we show that our novel training objective yields state-of-the-art results for conditional and unconditional generation on CIFAR-10 and improvements over baselines on AFHQ and FFHQ. We open-source our code and models: https://github.com/giannisdaras/cdm