Diffusion models generate samples by estimating the score function of the target distribution at various noise levels. The model is trained on samples drawn from the target distribution, to which noise is progressively added. Previous sample complexity bounds have polynomial dependence on the dimension $d$, apart from a $\log(|\mathcal{H}|)$ term, where $\mathcal{H}$ is the hypothesis class. In this work, we establish the first (nearly) dimension-free sample complexity bounds, modulo the $\log(|\mathcal{H}|)$ dependence, for learning these score functions, achieving a double exponential improvement in the dimension over prior results. A key aspect of our analysis is the use of a single function approximator to jointly estimate scores across noise levels, a practical feature that enables generalization across time steps. We introduce a martingale-based error decomposition and sharp variance bounds, enabling efficient learning from dependent data generated by Markov processes, which may be of independent interest. Building on these insights, we propose Bootstrapped Score Matching (BSM), a variance reduction technique that leverages previously learned scores to improve accuracy at higher noise levels. These results provide insights into the efficiency and effectiveness of diffusion models for generative modeling.
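To make the setup concrete, the following is a minimal sketch (not the authors' implementation) of denoising score matching with a single time-conditioned network shared across all noise levels. The `ScoreNet` architecture, the Ornstein-Uhlenbeck (VP-style) noise schedule, and all hyperparameters below are illustrative assumptions; under that schedule, $x_t \mid x_0 \sim \mathcal{N}(e^{-t}x_0, (1-e^{-2t})I)$, so the regression target is the conditional score $-\varepsilon/\sigma_t$.

```python
# Minimal denoising score matching sketch (illustrative; not the paper's code).
# A single time-conditioned network jointly estimates the score at all noise levels.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """One shared approximator s_theta(x, t) for every noise level t."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, t], dim=-1))

def dsm_loss(model: ScoreNet, x0: torch.Tensor) -> torch.Tensor:
    """Regress s_theta(x_t, t) onto -eps / sigma_t, the score of the
    Gaussian transition kernel of the OU noising process."""
    t = torch.rand(x0.shape[0], 1).clamp(min=0.05)   # avoid exploding targets near t = 0
    sigma = torch.sqrt(1.0 - torch.exp(-2.0 * t))    # OU marginal std (assumed schedule)
    eps = torch.randn_like(x0)
    xt = torch.exp(-t) * x0 + sigma * eps            # noised sample at level t
    target = -eps / sigma
    return ((model(xt, t) - target) ** 2).sum(-1).mean()

# Usage: one optimization step on toy data.
model = ScoreNet(dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x0 = torch.randn(256, 2)      # stand-in for samples from the target distribution
loss = dsm_loss(model, x0)
opt.zero_grad(); loss.backward(); opt.step()
```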
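The bootstrapping idea can likewise be sketched in hedged form. For the OU transition $x_s = \alpha x_t + \sigma\varepsilon$ with $\alpha = e^{-(s-t)}$ and $\sigma^2 = 1 - \alpha^2$, integration by parts gives $\nabla \log p_s(y) = \frac{1}{\alpha}\,\mathbb{E}[\nabla \log p_t(x_t) \mid x_s = y]$, so a previously learned score at level $t$ yields a regression target at level $s$ whose variance is tied to the score itself rather than to the raw noise. The exact target construction below is an assumption for illustration, not necessarily the paper's definition of BSM.

```python
# Bootstrapped-target sketch (illustrative): propagate an already-learned score
# from noise level t to a slightly higher level s = t + dt via the OU transition
#   x_s = alpha * x_t + sigma * eps,
# using grad log p_s(y) = (1/alpha) * E[ grad log p_t(x_t) | x_s = y ].
import torch

@torch.no_grad()
def bootstrap_target(s_prev, x_t: torch.Tensor, t: torch.Tensor, dt: float):
    """One-sample, lower-variance regression target for the score at level t + dt,
    built from the (frozen) score estimate s_prev at level t."""
    alpha = torch.exp(torch.tensor(-dt))
    sigma = torch.sqrt(1.0 - alpha ** 2)
    x_s = alpha * x_t + sigma * torch.randn_like(x_t)   # one OU transition step
    target = s_prev(x_t, t) / alpha                     # bootstrapped (TD-style) target
    return x_s, target

def bsm_loss(model, s_prev, x_t: torch.Tensor, t: torch.Tensor, dt: float = 0.05):
    """L2 regression of model(x_s, t + dt) onto the bootstrapped target; its
    conditional expectation given x_s recovers grad log p_{t+dt}(x_s)."""
    x_s, target = bootstrap_target(s_prev, x_t, t, dt)
    return ((model(x_s, t + dt) - target) ** 2).sum(-1).mean()
```

In practice a frozen copy of the current model would typically play the role of `s_prev`, echoing target networks in temporal-difference learning: the high-variance Monte Carlo target $-\varepsilon/\sigma$ is replaced by a bootstrapped one at higher noise levels.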