Diffusion models are a powerful method for generating approximate samples from high-dimensional data distributions. Several recent results have provided polynomial bounds on the convergence rate of such models, assuming $L^2$-accurate score estimators. However, until now the best known bounds of this kind were either superlinear in the data dimension or required strong smoothness assumptions. We provide the first convergence bounds which are linear in the data dimension (up to logarithmic factors) assuming only finite second moments of the data distribution. We show that diffusion models require at most $\tilde O(\frac{d \log^2(1/\delta)}{\varepsilon^2})$ steps to approximate an arbitrary data distribution on $\mathbb{R}^d$ corrupted with Gaussian noise of variance $\delta$ to within $\varepsilon^2$ in Kullback--Leibler divergence. Our proof builds on the Girsanov-based methods of previous works. We introduce a refined treatment of the error arising from the discretization of the reverse SDE, which is based on tools from stochastic localization.
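To make the scaling in the stated bound concrete, the following minimal sketch evaluates the leading-order quantity $d \log^2(1/\delta) / \varepsilon^2$. The function name `step_count_scale` is hypothetical, the hidden constant is assumed to be 1, the additional logarithmic factors absorbed by $\tilde O$ are dropped, and $\delta$ is assumed to lie in $(0, 1)$. Only ratios between the returned values are meaningful, not the values themselves.

```python
import math


def step_count_scale(d: int, delta: float, eps: float) -> float:
    """Leading-order scale d * log^2(1/delta) / eps^2 of the step bound.

    Assumptions (not from the source): the hidden constant is 1 and the
    extra logarithmic factors hidden in the tilde-O notation are ignored.
    """
    if d <= 0:
        raise ValueError("d must be a positive dimension")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta is assumed to lie in (0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return d * math.log(1.0 / delta) ** 2 / eps ** 2


if __name__ == "__main__":
    # Linear growth in the dimension d at fixed delta and eps.
    for d in (10, 100, 1000):
        print(d, step_count_scale(d, delta=1e-3, eps=0.1))
```

Under these assumptions, doubling $d$ doubles the returned value, while halving $\varepsilon$ multiplies it by four, which is the dependence the bound describes.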