Flow matching (FM) trains a time-dependent vector field that transports samples from a simple prior to a complex data distribution. For high-dimensional images, however, each training sample supervises only a single trajectory and a single intermediate point, yielding an extremely sparse, high-variance training signal. This under-constrained supervision can cause flow collapse, in which the learned dynamics memorize specific source-target pairings, map diverse inputs to overly similar outputs, and fail to generalize. We introduce Posterior-Augmented Flow Matching (PAFM), a theoretically grounded generalization of FM that replaces single-target supervision with an expectation over an approximate posterior of valid target completions for a given intermediate state and condition. PAFM factorizes this intractable posterior into (i) the likelihood of the intermediate state under a hypothesized endpoint and (ii) the prior probability of that endpoint under the condition, and uses an importance sampling scheme to construct a mixture over multiple candidate targets. We prove that PAFM yields an unbiased estimator of the original FM objective while substantially reducing gradient variance during training by aggregating information from many plausible continuation trajectories per intermediate state. Finally, we show that PAFM improves over FM by up to 3.4 FID50K across model scales (SiT-B/2 and SiT-XL/2), architectures (SiT and MMDiT), and both class- and text-conditioned benchmarks (ImageNet and CC12M), with a negligible increase in compute overhead. Code: https://github.com/gstoica27/PAFM.git.
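The posterior-weighted supervision described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a linear interpolation path x_t = (1 - t) x0 + t x1 with a standard Gaussian prior on x0, a uniform prior over a pool of candidate endpoints that share the same condition, and self-normalized weights. The function names `posterior_weights` and `posterior_augmented_target` are hypothetical.

```python
import numpy as np


def posterior_weights(x_t, candidates, t):
    """Self-normalized weights w_k proportional to p(x_t | x1_k).

    Assumes x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, I), so that
    p(x_t | x1) = N(t * x1, (1 - t)^2 I), and a uniform prior over
    the candidate endpoints (an assumption of this sketch).
    """
    sigma = 1.0 - t
    sq_dist = np.sum((x_t[None, :] - t * candidates) ** 2, axis=1)
    log_w = -0.5 * sq_dist / sigma**2
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()


def posterior_augmented_target(x_t, candidates, t):
    """Weighted mixture of per-candidate velocity targets at (x_t, t).

    Under the linear path, the velocity implied by endpoint x1 is
    x1 - x0 = (x1 - x_t) / (1 - t).
    """
    w = posterior_weights(x_t, candidates, t)
    velocities = (candidates - x_t[None, :]) / (1.0 - t)
    return w @ velocities


rng = np.random.default_rng(0)
dim, num_candidates = 4, 8
candidates = rng.normal(size=(num_candidates, dim))  # hypothetical same-condition pool
x0 = rng.normal(size=dim)                            # sample from the prior
t = 0.3
x_t = (1 - t) * x0 + t * candidates[0]               # intermediate state

w = posterior_weights(x_t, candidates, t)
target = posterior_augmented_target(x_t, candidates, t)
single_target = candidates[0] - x0                   # standard single-target FM velocity
```

With a pool of one candidate, the mixture reduces to the standard single-target FM velocity; with more candidates, the regression target at `x_t` averages over every endpoint that plausibly produced it, which is the intuition behind the variance reduction claimed above.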