Sampling from diffusion models can be treated as solving the corresponding ordinary differential equations (ODEs), with the aim of obtaining an accurate solution with as few function evaluations (NFE) as possible. Recently, various fast samplers based on higher-order ODE solvers have emerged and outperform the original first-order solver. However, these numerical methods inherently incur approximation errors, which significantly degrade sample quality when the NFE is extremely small (e.g., around 5). In contrast, based on the geometric observation that each sampling trajectory lies almost entirely in a two-dimensional subspace embedded in the ambient space, we propose the Approximate MEan-Direction Solver (AMED-Solver), which eliminates truncation errors by directly learning the mean direction for fast diffusion sampling. Moreover, our method can easily be used as a plugin to further improve existing ODE-based samplers. Extensive experiments on image synthesis at resolutions ranging from 32 to 512 demonstrate the effectiveness of our method. With only 5 NFE, we achieve 6.61 FID on CIFAR-10, 10.74 FID on ImageNet 64$\times$64, and 13.20 FID on LSUN Bedroom. Our code is available at https://github.com/zju-pi/diff-sampler.
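To make the mean-direction idea concrete, the following is a minimal sketch on a toy one-dimensional problem, not the authors' implementation. It assumes Gaussian data with standard deviation `SIGMA_DATA`, for which the sampling ODE has a closed-form solution, and it assumes the mean direction can be parameterized by a single intermediate time `s` reached with one Euler step. All function names are hypothetical, and a grid search against the exact solution stands in for the learned predictor, which in practice would have no access to the ground truth.

```python
import numpy as np

SIGMA_DATA = 1.0  # assumed std of the toy Gaussian data distribution


def direction(x, t):
    """Right-hand side dx/dt of the toy sampling ODE for N(0, SIGMA_DATA^2) data."""
    return x * t / (SIGMA_DATA**2 + t**2)


def exact(x_start, t_start, t_end):
    """Closed-form solution of the toy ODE, used only as ground truth."""
    return x_start * np.sqrt((SIGMA_DATA**2 + t_end**2) / (SIGMA_DATA**2 + t_start**2))


def euler_step(x, t, t_next):
    """First-order step: follow the direction evaluated at the start point."""
    return x + (t_next - t) * direction(x, t)


def mean_direction_step(x, t, t_next, s):
    """Follow the direction evaluated at an intermediate time s (two evaluations)."""
    x_s = euler_step(x, t, s)  # cheap estimate of the state at time s
    return x + (t_next - t) * direction(x_s, s)


def best_intermediate_time(x, t, t_next, num_candidates=999):
    """Hypothetical stand-in for a learned predictor: grid-search s in (t_next, t)."""
    target = exact(x, t, t_next)
    candidates = np.linspace(t_next, t, num_candidates + 2)[1:-1]
    errors = [abs(mean_direction_step(x, t, t_next, s) - target) for s in candidates]
    return float(candidates[int(np.argmin(errors))])


# One large step from t0 down to t1, as in a very low NFE budget.
x0, t0, t1 = 3.0, 2.0, 0.5
s_star = best_intermediate_time(x0, t0, t1)
err_euler = abs(euler_step(x0, t0, t1) - exact(x0, t0, t1))
err_mean = abs(mean_direction_step(x0, t0, t1, s_star) - exact(x0, t0, t1))
print(f"s* = {s_star:.4f}, Euler error = {err_euler:.4f}, mean-direction error = {err_mean:.6f}")
```

In this toy setting a well-chosen intermediate time removes almost all of the single-step error that the Euler step leaves behind, which is the intuition behind learning the direction rather than approximating it with a fixed numerical scheme.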