Diffusion models have become the most popular approach to deep generative modeling of images, largely due to their empirical performance and reliability. From a theoretical standpoint, a number of recent works~\cite{chen2022,chen2022improved,benton2023linear} have studied the iteration complexity of sampling, assuming access to an accurate diffusion model. In this work, we focus on understanding the \emph{sample complexity} of training such a model: how many samples are needed to learn an accurate diffusion model using a sufficiently expressive neural network? Prior work~\cite{BMR20} showed bounds polynomial in the dimension, the desired total variation (TV) error, and the Wasserstein error. We show an \emph{exponential improvement} in the dependence on the Wasserstein error and the network depth, along with improved dependence on the other relevant parameters.