Sampling from diffusion probabilistic models (DPMs) can be viewed as a piecewise distribution transformation, which generally requires hundreds or thousands of steps along the inverse diffusion trajectory to produce a high-quality image. Recent fast samplers for DPMs trade sampling speed against sample quality by means of knowledge distillation, or by adjusting the variance schedule or the denoising equation. However, these samplers cannot be optimal in both respects and often suffer from mode mixture when only a few steps are used. To tackle this problem, we regard inverse diffusion as an optimal transport (OT) problem between latents at different stages and propose DPM-OT, a unified learning framework for fast DPMs with a direct expressway represented by the OT map, which can generate high-quality samples within around 10 function evaluations. By computing the semi-discrete optimal transport map between the data latents and the white noise, we obtain an expressway from the prior distribution to the data distribution, while significantly alleviating the problem of mode mixture. In addition, we give an error bound for the proposed method, which theoretically guarantees the stability of the algorithm. Extensive experiments validate the effectiveness and advantages of DPM-OT in terms of speed and quality (FID and mode mixture), making it an efficient solution for generative modeling. Source code is available at https://github.com/cognaclee/DPM-OT
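
To make the semi-discrete OT step concrete, here is a minimal NumPy sketch of how a map from white noise to a finite set of target points might be computed. It is not taken from the DPM-OT repository. It assumes a quadratic transport cost, uniform weights on the targets, and Monte Carlo gradient descent on the heights of a piecewise-linear Brenier potential. The names `fit_semi_discrete_ot` and `ot_map`, and the step size, batch size, and iteration count, are hypothetical choices made for illustration.

```python
import numpy as np


def fit_semi_discrete_ot(targets, n_iters=500, batch=4096, lr=0.5, seed=0):
    """Estimate heights h for a semi-discrete OT map from N(0, I) to `targets`.

    Assumptions (illustrative, not from the paper): quadratic cost, uniform
    target weights, and plain stochastic gradient descent on the convex
    energy whose gradient is (cell mass - target weight).
    """
    rng = np.random.default_rng(seed)
    n, d = targets.shape
    nu = np.full(n, 1.0 / n)  # uniform mass on each target point
    h = np.zeros(n)           # one height per target point
    for _ in range(n_iters):
        x = rng.standard_normal((batch, d))           # white-noise samples
        idx = np.argmax(x @ targets.T + h, axis=1)    # cell each sample lands in
        w = np.bincount(idx, minlength=n) / batch     # Monte Carlo cell masses
        h -= lr * (w - nu)                            # shrink over-full cells
        h -= h.mean()                                 # remove the free constant
    return h


def ot_map(x, targets, h):
    """Send each noise sample to the target point that owns its cell."""
    idx = np.argmax(x @ targets.T + h, axis=1)
    return targets[idx]
```

In this sketch every noise sample is mapped to exactly one target point, never to a blend of several, which is the intuition behind the reduced mode mixture claimed above. A call such as `ot_map(noise, targets, fit_semi_discrete_ot(targets))` returns an array with the same shape as `noise`.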