Flow Matching (FM) models have emerged as a leading paradigm for high-fidelity synthesis. However, their reliance on iterative Ordinary Differential Equation (ODE) solving creates a substantial latency bottleneck. Existing solutions face a dichotomy: training-free solvers degrade sharply at low Neural Function Evaluations (NFEs), while training-based one- or few-step generation methods incur prohibitive training costs and lack plug-and-play versatility. To bridge this gap, we propose the Bi-Anchor Interpolation Solver (BA-solver). BA-solver retains the versatility of standard training-free solvers while achieving significant acceleration by introducing a lightweight SideNet (1-2% of the backbone size) alongside the frozen backbone. Specifically, our method is founded on two synergistic components: \textbf{1) Bidirectional Temporal Perception}, where the SideNet learns to approximate both future and historical velocities without retraining the heavy backbone; and \textbf{2) Bi-Anchor Velocity Integration}, which uses the SideNet together with two anchor velocities to efficiently approximate intermediate velocities for batched high-order integration. By using the backbone to establish high-precision ``anchors'' and the SideNet to densify the trajectory, BA-solver enables large interval sizes with minimal error. Empirical results on ImageNet-$256^2$ demonstrate that BA-solver achieves generation quality comparable to that of a 100+ NFE Euler solver in just 10 NFEs and maintains high fidelity with as few as 5 NFEs, while incurring negligible training cost. Furthermore, BA-solver integrates seamlessly with existing generative pipelines, facilitating downstream tasks such as image editing.
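The anchor-and-densify idea can be illustrated on a toy one-dimensional ODE. The sketch below is not the BA-solver algorithm itself, whose details the abstract does not give. It rests on several assumptions made purely for illustration: the field `backbone_velocity` ($dx/dt = -x$) stands in for the frozen backbone, `side_net_velocity` is a hypothetical hand-written surrogate standing in for a learned SideNet, the second anchor is taken at an Euler-predicted endpoint, and the three velocities are combined with Simpson's rule. It compares this scheme with a plain Euler solver at the same backbone budget.

```python
import math


def backbone_velocity(x, t):
    # Toy stand-in for the frozen backbone (assumption): the field dx/dt = -x,
    # whose exact solution is x(t) = x0 * exp(-t).
    return -x


def side_net_velocity(x, v_start, v_end, h):
    # Hypothetical stand-in for a lightweight SideNet (not the paper's model).
    # It estimates the midpoint state by integrating the linearly interpolated
    # anchor velocities over the first half of the interval, then returns the
    # toy field at that state. A real SideNet would be a small learned network.
    x_mid = x + 0.5 * h * (0.75 * v_start + 0.25 * v_end)
    return -x_mid


def euler_sample(x0, num_steps):
    # Baseline: one backbone call (1 NFE) per step.
    x, h, nfe = x0, 1.0 / num_steps, 0
    for i in range(num_steps):
        x = x + h * backbone_velocity(x, i * h)
        nfe += 1
    return x, nfe


def bi_anchor_sample(x0, num_steps):
    # Two backbone calls per interval give the anchors; the cheap surrogate
    # supplies the midpoint velocity and is not counted as an NFE.
    x, h, nfe = x0, 1.0 / num_steps, 0
    for i in range(num_steps):
        t = i * h
        v_start = backbone_velocity(x, t)                  # anchor at interval start
        v_end = backbone_velocity(x + h * v_start, t + h)  # anchor at Euler-predicted end
        nfe += 2
        v_mid = side_net_velocity(x, v_start, v_end, h)
        # Simpson's rule over the interval (an assumed quadrature choice).
        x = x + h / 6.0 * (v_start + 4.0 * v_mid + v_end)
    return x, nfe


exact = math.exp(-1.0)
x_euler, nfe_euler = euler_sample(1.0, 10)
x_anchor, nfe_anchor = bi_anchor_sample(1.0, 5)
err_euler = abs(x_euler - exact)
err_anchor = abs(x_anchor - exact)
print(f"Euler:     NFE={nfe_euler}, abs error={err_euler:.5f}")
print(f"Bi-anchor: NFE={nfe_anchor}, abs error={err_anchor:.5f}")
```

In this toy setting both samplers spend 10 backbone evaluations, and the bi-anchor variant ends closer to the exact solution because each interval uses three velocity samples instead of one. How well that carries over to a learned SideNet and a real FM backbone is an empirical question the toy cannot answer.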