Flow diffusion models (FDMs) have recently shown great potential in generation tasks due to their high generation quality. However, current ordinary differential equation (ODE) solvers for FDMs, e.g., the Euler solver, still suffer from slow generation, since these solvers require a large number of function evaluations (NFE) to maintain high-quality generation. In this paper, we propose a novel training-free flow-solver that reduces the NFE while maintaining high-quality generation. The key insight behind the flow-solver is to leverage the results of previous steps, stored in a cache for reuse, to reduce the NFE. Specifically, a Taylor expansion is first used to approximate the ODE. To calculate the high-order derivatives of the Taylor expansion, the flow-solver uses the cached previous steps together with polynomial interpolation, where the achievable approximation order equals the number of cached previous steps. We also prove that the flow-solver achieves a smaller approximation error and faster generation speed. Experimental results on CIFAR-10, CelebA-HQ, LSUN-Bedroom, LSUN-Church, ImageNet, and real text-to-image generation demonstrate the efficiency of the flow-solver. Specifically, with $\text{NFE}=10$, the flow-solver improves FID-30K from 13.79 to 6.75 on CIFAR-10 and from 46.64 to 19.49 on LSUN-Church.
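The multistep idea described above (caching previous velocity evaluations and combining them via polynomial interpolation to approximate higher-order Taylor terms at one NFE per step) can be sketched as follows. This is a minimal illustration using Adams-Bashforth-style coefficients on a toy scalar ODE, not the paper's exact interpolation scheme; the function `flow_solver_sketch` and its signature are hypothetical.

```python
import math

def flow_solver_sketch(velocity, x0, t_grid, order=2):
    """Multistep solver sketch: reuse cached velocity evaluations from
    previous steps to approximate higher-order Taylor terms.  Each step
    costs a single new function evaluation (NFE), regardless of order."""
    # Adams-Bashforth coefficients for uniform step sizes, keyed by order.
    coeffs = {1: [1.0], 2: [1.5, -0.5], 3: [23/12, -16/12, 5/12]}
    x = x0
    cache = []  # most recent velocity evaluations, newest first
    for i in range(len(t_grid) - 1):
        h = t_grid[i + 1] - t_grid[i]
        cache.insert(0, velocity(x, t_grid[i]))  # the single new NFE
        k = min(order, len(cache))               # warm up at lower order
        x = x + h * sum(c * v for c, v in zip(coeffs[k], cache[:k]))
        cache = cache[:order]                    # keep only what we reuse
    return x

# Toy linear ODE dx/dt = -x; the exact solution at t=1 is exp(-1).
v = lambda x, t: -x
grid = [i / 10 for i in range(11)]  # 10 steps, i.e. 10 NFE
x_ms = flow_solver_sketch(v, 1.0, grid, order=2)
x_euler = flow_solver_sketch(v, 1.0, grid, order=1)  # order 1 = Euler
print(abs(x_ms - math.exp(-1)) < abs(x_euler - math.exp(-1)))  # True
```

At the same NFE budget, the second-order multistep update tracks the true trajectory much more closely than Euler, which mirrors the abstract's claim that reusing cached steps reduces the approximation error without extra function evaluations.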