This paper investigates score-based diffusion models when the underlying target distribution is concentrated on or near low-dimensional manifolds within the higher-dimensional space in which it formally resides, a common characteristic of natural image distributions. Despite previous efforts to understand the data generation process of diffusion models, existing theoretical guarantees remain highly suboptimal in the presence of low-dimensional structure; this paper strengthens them. For the popular Denoising Diffusion Probabilistic Model (DDPM), we find that the dependency of the error incurred within each denoising step on the ambient dimension $d$ is in general unavoidable. We further identify a unique design of coefficients that yields a convergence rate of order $O(k^{2}/\sqrt{T})$ (up to log factors), where $k$ is the intrinsic dimension of the target distribution and $T$ is the number of steps. This represents the first theoretical demonstration that the DDPM sampler can adapt to unknown low-dimensional structures in the target distribution, highlighting the critical importance of coefficient design. All of this is achieved by a novel set of analysis tools that characterize the algorithmic dynamics in a more deterministic manner.
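To make the role of coefficient design concrete, the following is a minimal toy sketch and not the analysis or construction of the paper. It assumes a Gaussian target supported on a $k$-dimensional linear subspace of $\mathbb{R}^d$, so that the score is available in closed form, and a generic DDPM-style update $Y_{t-1} = (Y_t + \eta_t s_t(Y_t) + \sigma_t Z_t)/\sqrt{\alpha_t}$ with $\eta_t = 1-\alpha_t$. The function names, the noise schedule, and the two variance rules compared ("posterior" and "beta", both standard choices in the DDPM literature) are illustrative assumptions; the abstract does not state which coefficients the paper identifies.

```python
import numpy as np

def gaussian_subspace_score(y, U, alpha_bar):
    """Exact score of N(0, alpha_bar * U U^T + (1 - alpha_bar) * I) at each row of y.

    Assumes U (d x k) has orthonormal columns, i.e. the toy target is N(0, U U^T).
    """
    on = (y @ U) @ U.T            # component inside the k-dimensional subspace
    off = y - on                  # component orthogonal to the subspace
    # Marginal variance is 1 inside the subspace and (1 - alpha_bar) outside it.
    return -on - off / (1.0 - alpha_bar)

def ddpm_sample(U, alphas, n, variance="posterior", seed=0):
    """Run a DDPM-style reverse process with the exact toy score.

    variance="posterior": sigma_t^2 = (1 - a_t)(a_t - abar_t) / (1 - abar_t)
    variance="beta":      sigma_t^2 = 1 - a_t
    Both rules are illustrative assumptions, not taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    d = U.shape[0]
    alpha_bars = np.cumprod(alphas)
    y = rng.standard_normal((n, d))               # Y_T ~ N(0, I)
    for t in range(len(alphas) - 1, -1, -1):      # index 0 is the final step
        a, ab = alphas[t], alpha_bars[t]
        eta = 1.0 - a
        if variance == "posterior":
            sigma2 = (1.0 - a) * (a - ab) / (1.0 - ab)
        else:
            sigma2 = 1.0 - a
        z = rng.standard_normal((n, d))
        score = gaussian_subspace_score(y, U, ab)
        y = (y + eta * score + np.sqrt(sigma2) * z) / np.sqrt(a)
    return y

def off_subspace_rms(y, U):
    """Root-mean-square norm of the part of each sample lying outside span(U)."""
    off = y - (y @ U) @ U.T
    return float(np.sqrt(np.mean(np.sum(off ** 2, axis=1))))
```

With, for example, `U` taken from `np.linalg.qr` of a random $50 \times 2$ matrix and `alphas = 1.0 - np.linspace(1e-4, 0.02, 1000)`, comparing `off_subspace_rms` for the two variance rules shows how much mass each leaves off the subspace. In this toy setting the "posterior" rule injects no noise at the final step, whereas the "beta" rule injects noise in all $d$ ambient directions, which is one simple way to see why the choice of coefficients can interact with the ambient dimension.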