In this work, we investigate an intriguing and prevalent phenomenon of diffusion models which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We confirm this phenomenon through comprehensive experiments, which imply that different diffusion models consistently reach the same data distribution and score function regardless of diffusion model framework, model architecture, or training procedure. More strikingly, our further investigation implies that the distribution a diffusion model learns depends on the training data size. This is supported by the fact that model reproducibility manifests in two distinct training regimes: (i) a "memorization regime", where the diffusion model overfits to the training data distribution, and (ii) a "generalization regime", where the model learns the underlying data distribution. Our study also finds that this valuable property generalizes to many variants of diffusion models, including conditional diffusion models, diffusion models for solving inverse problems, and fine-tuned diffusion models. Finally, our work raises numerous intriguing theoretical questions for future investigation and highlights practical implications for training efficiency, model privacy, and controlled generation with diffusion models.
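As a toy illustration of the setup described above (same starting noise, deterministic sampler, different models), the following sketch may help. It is not the experimental protocol of this work: two closed-form Gaussian score functions, each fitted to a separate training set, stand in for two independently trained diffusion models, and a plain Euler discretization of the probability-flow ODE stands in for the deterministic sampler. All names (`make_gaussian_score`, `pf_ode_sample`, the noise schedule, and the data parameters) are hypothetical choices made for this example.

```python
import numpy as np


def make_gaussian_score(samples):
    """Return the score of a diagonal Gaussian fitted to `samples`,
    evaluated after adding Gaussian noise of standard deviation sigma.
    This is an assumed stand-in for a trained score network."""
    mu = samples.mean(axis=0)
    var = samples.var(axis=0)

    def score(x, sigma):
        # grad_x log N(x; mu, var + sigma^2)
        return -(x - mu) / (var + sigma**2)

    return score


def pf_ode_sample(score, x_init, sigmas):
    """Deterministic Euler integration of dx/dsigma = -sigma * score(x, sigma)
    along a decreasing sequence of noise levels."""
    x = x_init.copy()
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = x + (s_next - s_cur) * (-s_cur * score(x, s_cur))
    return x


rng = np.random.default_rng(0)
dim, sigma_max = 2, 50.0
true_mu = np.array([1.0, -2.0])
true_std = np.array([0.5, 1.5])

# Two "models" fitted on disjoint training sets drawn from the same distribution.
train_a = true_mu + true_std * rng.standard_normal((5000, dim))
train_b = true_mu + true_std * rng.standard_normal((5000, dim))
score_a = make_gaussian_score(train_a)
score_b = make_gaussian_score(train_b)

sigmas = np.geomspace(sigma_max, 1e-3, 200)  # decreasing noise schedule
noise = sigma_max * rng.standard_normal((256, dim))
other_noise = sigma_max * rng.standard_normal((256, dim))

out_a = pf_ode_sample(score_a, noise, sigmas)
out_b = pf_ode_sample(score_b, noise, sigmas)
out_a_other = pf_ode_sample(score_a, other_noise, sigmas)

# Mean distance between paired outputs: same noise across models,
# versus different noise within one model.
same_noise_gap = np.linalg.norm(out_a - out_b, axis=1).mean()
diff_noise_gap = np.linalg.norm(out_a - out_a_other, axis=1).mean()
print(f"same noise, different models: {same_noise_gap:.4f}")
print(f"different noise, same model:  {diff_noise_gap:.4f}")
```

In this simplified setting, the gap between two models started from the same noise should be much smaller than the gap between two samples of a single model started from different noise, because both fitted scores approximate the same underlying score function. Whether and how this carries over to neural diffusion models is what the experiments in this work examine.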