Fine-tuning flow matching models is a central challenge in settings with limited data, evolving distributions, or computational constraints. Although recent work has produced significant advances, particularly in reward-based fine-tuning, current methods fail to combine theoretical correctness with strong empirical results in terms of stability, efficiency, and diversity preservation. In this work, we propose Gradual Fine-Tuning (GFT), a simple yet principled annealing-based framework for fine-tuning flow-based generative models when only samples from the target distribution are available. For stochastic flows, GFT defines a temperature-controlled sequence of intermediate objectives that smoothly interpolates between the pretrained and target drifts, provably approaching the true target as the temperature approaches zero. We show analytically that sample generation after GFT can be made substantially more efficient by using arbitrary (e.g., optimal transport) couplings and few-step inference methods. Empirically, GFT significantly improves convergence stability while maintaining or improving generation quality, training speed, and generation diversity compared to other fine-tuning methods. Our results position GFT as a simple, theoretically grounded, and practically effective alternative for scalable adaptation of flow matching models under distribution shift.