While efficient distribution learning is no doubt behind the groundbreaking success of diffusion modeling, its theoretical guarantees are quite limited. In this paper, we provide the first rigorous analysis of the approximation and generalization abilities of diffusion modeling for well-known function spaces. The highlight of this paper is that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves nearly minimax optimal estimation rates in the total variation distance and in the Wasserstein distance of order one. Furthermore, we extend our theory to demonstrate how diffusion models adapt to low-dimensional data distributions. We expect these results to advance the theoretical understanding of diffusion modeling and its ability to generate verisimilar outputs.