We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2. Our main result is that, assuming accurate score estimates, such SGMs can efficiently sample from essentially any realistic data distribution. In contrast to prior works, our results (1) hold for an $L^2$-accurate score estimate (rather than $L^\infty$-accurate); (2) do not require restrictive functional inequality conditions that preclude substantial non-log-concavity; (3) scale polynomially in all relevant problem parameters; and (4) match state-of-the-art complexity guarantees for discretization of the Langevin diffusion, provided that the score error is sufficiently small. We view this as strong theoretical justification for the empirical success of SGMs. We also examine SGMs based on the critically damped Langevin diffusion (CLD). Contrary to conventional wisdom, we provide evidence that the use of the CLD does not reduce the complexity of SGMs.