Diffusion models have become the most popular approach to deep generative modeling of images, largely due to their empirical performance and reliability. From a theoretical standpoint, a number of recent works~\cite{chen2022,chen2022improved,benton2023linear} have studied the iteration complexity of sampling, assuming access to an accurate diffusion model. In this work, we focus on understanding the \emph{sample complexity} of training such a model: how many samples are needed to learn an accurate diffusion model using a sufficiently expressive neural network? Prior work~\cite{BMR20} showed bounds polynomial in the dimension, the desired total variation error, and the Wasserstein error. We show an \emph{exponential improvement} in the dependence on the Wasserstein error and the network depth, along with improved dependencies on other relevant parameters.