Score-based generative modeling (SGM) is a highly successful approach for learning a probability distribution from data and generating further samples. We prove the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density $p$ given a score estimate (an estimate of $\nabla \ln p$) that is accurate in $L^2(p)$. Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality. Our guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant. Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales. Our analysis gives theoretical grounding to the observation that an annealed procedure is required in practice to generate good samples, as our proof depends essentially on using annealing to obtain a warm start at each step. Moreover, we show that a predictor-corrector algorithm gives better convergence than using either portion alone.
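As a rough illustration of the annealed sampling idea described above, the following is a minimal sketch of annealed Langevin dynamics on a one-dimensional Gaussian target. It is not the algorithm or step-size schedule analyzed in this work. It assumes the data distribution is $N(\mu, s^2)$, so that the score at noise scale $\sigma$ is known in closed form and stands in for a learned score estimate. The function names, noise scales, and step sizes are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_score(x, mu, var):
    """Exact score (gradient of log density) of N(mu, var).

    Assumption: this closed form replaces the learned score estimate.
    """
    return -(x - mu) / var


def annealed_langevin(n_samples=5000, mu=2.0, data_std=0.5,
                      sigmas=(4.0, 2.0, 1.0, 0.5, 0.1),
                      steps_per_level=200, step_scale=0.05, seed=0):
    """Run Langevin dynamics at decreasing noise scales (hypothetical schedule)."""
    rng = np.random.default_rng(seed)
    # White-noise input, scaled to the largest noise level.
    x = rng.standard_normal(n_samples) * sigmas[0]
    for sigma in sigmas:
        # Data N(mu, data_std^2) convolved with N(0, sigma^2) is again Gaussian.
        var = data_std ** 2 + sigma ** 2
        # Step size proportional to the variance at this level (illustrative).
        h = step_scale * var
        for _ in range(steps_per_level):
            z = rng.standard_normal(n_samples)
            # Langevin update: drift along the score plus injected noise.
            x = x + h * gaussian_score(x, mu, var) + np.sqrt(2.0 * h) * z
        # Samples from this level serve as the warm start for the next one.
    return x


samples = annealed_langevin()
print("sample mean:", samples.mean())
print("sample std: ", samples.std())
```

Each noise level starts from the samples produced at the previous, noisier level, which is the warm start that annealing provides. The discretization introduces a small bias in the sample variance, so the printed standard deviation only approximates that of the target at the smallest noise scale.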