Flow-based Generative Models (FGMs) transform noise into complex data distributions. Coupling noise and data with Optimal Transport (OT) during FGM training has been shown to straighten flow trajectories, enabling more effective inference. However, existing OT-based methods estimate the OT plan from (mini-)batches of sampled noise and data points, which limits their scalability to large, high-dimensional datasets. This paper introduces AlignFlow, a novel approach that leverages Semi-Discrete Optimal Transport (SDOT) to enhance FGM training by establishing an explicit, optimal alignment between the noise distribution and the data points, with guaranteed convergence. SDOT computes a transport map by partitioning the noise space into Laguerre cells, each of which is mapped to a corresponding data point. During FGM training, i.i.d. noise samples are paired with data points via the SDOT map. AlignFlow scales well to large datasets and model architectures with negligible computational overhead. Experimental results show that AlignFlow improves the performance of a wide range of state-of-the-art FGM algorithms and can be integrated as a plug-and-play component. Code is available at: https://github.com/konglk1203/AlignFlow.
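The Laguerre-cell pairing described above can be illustrated with a minimal NumPy sketch. It is not the AlignFlow implementation: the function names `sdot_assign` and `fit_dual_weights`, the dual weights `g`, the stochastic-ascent update, and all hyperparameters are assumptions chosen for illustration. The sketch assumes a squared Euclidean cost, standard Gaussian noise, and uniform mass 1/n on each of the n data points, so that a noise sample x belongs to the cell of the data point y_j minimizing ||x - y_j||^2 - g_j.

```python
import numpy as np

def sdot_assign(noise, data, g):
    """Return, for each noise sample, the index of the Laguerre cell it falls in."""
    # cost[i, j] = ||noise_i - data_j||^2 - g_j; the cell is the argmin over j.
    sq_dist = ((noise[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(sq_dist - g[None, :], axis=1)

def fit_dual_weights(data, steps=2000, batch=256, lr=0.5, seed=0):
    """Stochastic ascent on the semi-discrete dual (hypothetical hyperparameters)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    g = np.zeros(n)
    for _ in range(steps):
        noise = rng.standard_normal((batch, d))
        mass = np.bincount(sdot_assign(noise, data, g), minlength=n) / batch
        # Dual gradient: target mass 1/n minus the estimated mass of each cell.
        g += lr * (1.0 / n - mass)
    return g

# Toy data: four points, deliberately off-center relative to N(0, I) noise.
data = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [3.0, 3.0]])
g = fit_dual_weights(data)

# Pair fresh i.i.d. noise with data points through the fitted map.
rng = np.random.default_rng(1)
x0 = rng.standard_normal((20000, 2))
idx = sdot_assign(x0, data, g)
x1 = data[idx]
cell_mass = np.bincount(idx, minlength=len(data)) / len(x0)

# Assumed straight-line flow-matching target (not specified in the abstract).
t = rng.uniform(size=(len(x0), 1))
x_t = (1.0 - t) * x0 + t * x1
v_target = x1 - x0
```

With `g` set to zero the map reduces to nearest-neighbor assignment, which sends the large majority of noise samples to the data point at the origin in this toy setup. After fitting, `cell_mass` should be close to 0.25 for every point, which is the property that lets each data point be paired with an equal share of the noise distribution.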