Diffusion-based generative models have achieved remarkable performance across various domains, yet their practical deployment is often limited by high sampling costs. While prior work focuses on training objectives or individual solvers, the holistic design of sampling, specifically solver selection and scheduling, remains dominated by static heuristics. In this work, we revisit this challenge through a geometric lens, proposing SDM, a principled framework that aligns the numerical solver with the intrinsic properties of the diffusion trajectory. By analyzing the ODE dynamics, we show that efficient low-order solvers suffice in early high-noise stages while higher-order solvers can be progressively deployed to handle the increasing non-linearity of later stages. Furthermore, we formalize the scheduling by introducing a Wasserstein-bounded optimization framework. This method systematically derives adaptive timesteps that explicitly bound the local discretization error, ensuring the sampling process remains faithful to the underlying continuous dynamics. Without requiring additional training or architectural modifications, SDM achieves state-of-the-art performance across standard benchmarks, including an FID of 1.93 on CIFAR-10, 2.41 on FFHQ, and 1.98 on AFHQv2, with a reduced number of function evaluations compared to existing samplers. Our code is available at https://github.com/aiimaginglab/sdm.
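The progressive solver-order schedule described above can be sketched as a simple rule that maps the current noise level to a solver order: cheap first-order steps in the early high-noise regime, higher orders as the trajectory's non-linearity grows. This is an illustrative sketch only; the thresholds, the noise range, and the `solver_order` function are hypothetical and do not reproduce the paper's actual Wasserstein-bounded schedule.

```python
import math

def solver_order(sigma, sigma_max=80.0, sigma_min=0.002):
    """Illustrative order schedule (hypothetical thresholds): use a cheap
    low-order step while noise is high and the ODE trajectory is nearly
    linear, then raise the order as sigma shrinks and curvature grows."""
    # Normalized log-noise position in [0, 1]: 0 = max noise, 1 = min noise.
    t = (math.log(sigma_max) - math.log(sigma)) / (
        math.log(sigma_max) - math.log(sigma_min))
    if t < 0.4:
        return 1  # Euler-style step suffices in the high-noise regime
    elif t < 0.8:
        return 2  # Heun-style correction handles moderate curvature
    else:
        return 3  # higher-order step near the data manifold
```

A sampler would query this rule at each timestep and dispatch to the corresponding numerical update, so the extra model evaluations of high-order solvers are spent only where the dynamics demand them.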