Denoising Diffusions with Optimal Transport: Localization, Curvature, and Multi-Scale Complexity

Adding noise is easy; what about denoising? Diffusion is easy; what about reversing a diffusion? Diffusion-based generative models aim to denoise a Langevin diffusion chain, moving from a log-concave equilibrium measure $ν$, say an isotropic Gaussian, back to a complex, possibly non-log-concave initial measure $μ$. The score function performs denoising: it moves backward in time by predicting the conditional mean of the past location given the current one. We show that score denoising is the optimal backward map with respect to transportation cost. What is its localization uncertainty? We show that the curvature function determines this localization uncertainty, measured as the conditional variance of the past location given the current one. In this paper, we study the effectiveness of the diffuse-then-denoise process: the contraction of the forward diffusion chain, offset by the possible expansion of the backward denoising chain, governs the denoising difficulty. For any initial measure $μ$, we prove that this offset net contraction at time $t$ is characterized by the curvature complexity of a smoothed $μ$ at a specific signal-to-noise ratio (SNR) scale $r(t)$. We discover that the multi-scale curvature complexity collectively determines the difficulty of the denoising chain. Our multi-scale complexity quantifies a fine-grained notion of average-case curvature rather than the worst case. Curiously, it depends on an integrated tail function, measuring the relative mass of locations with positive curvature versus those with negative curvature; denoising at a specific SNR scale is easy if such an integrated tail is light. We conclude with several non-log-concave examples to demonstrate how the multi-scale complexity probes the bottleneck SNR for the diffuse-then-denoise process.
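To make the roles of the score and the curvature concrete, the following is a minimal one-dimensional sketch, not the paper's construction. It assumes an Ornstein-Uhlenbeck forward chain $X_t = e^{-t} X_0 + \sqrt{1 - e^{-2t}}\, Z$ with standard Gaussian $Z$, takes "curvature" to mean the second derivative of the log-density of $X_t$, and uses a hypothetical two-component Gaussian mixture as $μ$. Under these assumptions, the standard Tweedie-type identities give the conditional mean of $X_0$ from the score and the conditional variance from the curvature; the sketch checks both against the closed-form posterior of the mixture.

```python
import math

# Hypothetical initial measure mu: a two-component Gaussian mixture (non-log-concave).
WEIGHTS = (0.3, 0.7)
MEANS = (-2.0, 1.5)
VARS = (0.25, 0.5)


def normal_pdf(y, mean, var):
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_coeffs(t):
    # Assumed OU chain: X_t = a * X_0 + sqrt(sig2) * Z, with a = exp(-t), sig2 = 1 - a^2.
    a = math.exp(-t)
    return a, 1.0 - a * a


def log_density(y, t):
    # Log-density of X_t: each mixture component stays Gaussian under the OU chain.
    a, sig2 = forward_coeffs(t)
    p = sum(w * normal_pdf(y, a * m, a * a * v + sig2)
            for w, m, v in zip(WEIGHTS, MEANS, VARS))
    return math.log(p)


def score_and_curvature(y, t, h=1e-3):
    # Central finite differences for the first and second derivative of log p_t.
    f0, fp, fm = log_density(y, t), log_density(y + h, t), log_density(y - h, t)
    return (fp - fm) / (2.0 * h), (fp - 2.0 * f0 + fm) / (h * h)


def tweedie_posterior(y, t):
    # Conditional mean from the score, conditional variance from the curvature.
    a, sig2 = forward_coeffs(t)
    score, curv = score_and_curvature(y, t)
    mean = (y + sig2 * score) / a
    var = (sig2 / (a * a)) * (1.0 + sig2 * curv)
    return mean, var


def exact_posterior(y, t):
    # Closed-form mean and variance of X_0 given X_t = y for the mixture.
    a, sig2 = forward_coeffs(t)
    resp, mus, vs = [], [], []
    for w, m, v in zip(WEIGHTS, MEANS, VARS):
        tot = a * a * v + sig2
        resp.append(w * normal_pdf(y, a * m, tot))
        mus.append(m + a * v * (y - a * m) / tot)
        vs.append(v * sig2 / tot)
    z = sum(resp)
    resp = [r / z for r in resp]
    mean = sum(r * mu for r, mu in zip(resp, mus))
    second = sum(r * (vv + mu * mu) for r, mu, vv in zip(resp, mus, vs))
    return mean, second - mean * mean


if __name__ == "__main__":
    for y in (-1.0, 0.0, 1.0):
        print(y, tweedie_posterior(y, 0.5), exact_posterior(y, 0.5))
```

In this toy setting, the conditional variance is largest at locations between the two modes, where the log-density of $X_t$ is least concave; this is one way to picture why positive-curvature regions make localization harder.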
