We propose Coreset-Induced Conditional Velocity Flow Matching (CCVFM), a generative model that augments hierarchical rectified flow with a data-informed source distribution. Hierarchical flow matching models the full conditional velocity law in velocity space, but its inner flow is asked to transport isotropic Gaussian noise to a multimodal target velocity distribution from scratch. Our key observation is that this inner source can be replaced by a closed-form surrogate built from a coreset of the target. CCVFM first compresses the target into weighted atoms using an entropic Sinkhorn coreset and lifts them to a Gaussian mixture. The induced conditional velocity law is a closed-form Gaussian mixture that can be sampled without a learned neural sampler. A lightweight correction flow, trained from this exact surrogate source, then refines the remaining surrogate-to-target residual rather than learning an entire noise-to-data map. We prove that the surrogate transport cost equals the target-surrogate Wasserstein gap under an explicit compression assumption, whereas the noise-source analogue has a dimension-scale lower bound. We further characterize the conditional second moment of the direct surrogate-source training target and show that its source-dependent excess is small when the surrogate conditional law is close to the true conditional velocity law in mean and covariance. Empirically, on MNIST, CIFAR-10, ImageNet-32, and CelebA-HQ, the proposed method is competitive in few-step generation under matched architectures.
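To illustrate how a conditional velocity law can be a closed-form Gaussian mixture, the following is a minimal NumPy sketch, not the authors' implementation. It rests on assumptions that the abstract does not state: a linear interpolation path x_t = (1 - t) x_0 + t x_1 with velocity v = x_1 - x_0, a standard Gaussian x_0, and a surrogate x_1 drawn from an isotropic Gaussian mixture with shared bandwidth `s` centered at the coreset atoms. The atoms and weights are taken as given (the entropic Sinkhorn coreset step is not shown), and the function names and the parameter `s` are hypothetical.

```python
import numpy as np


def conditional_velocity_gmm(x_t, t, atoms, weights, s):
    """Mixture parameters of p(v | x_t) under the assumptions in the lead-in.

    Assumed model (illustrative only): x_0 ~ N(0, I),
    x_1 | k ~ N(atoms[k], s^2 I) with prior weights[k],
    x_t = (1 - t) x_0 + t x_1, and v = x_1 - x_0.
    """
    # Per-coordinate variance of x_t within one mixture component.
    var_t = (1.0 - t) ** 2 + (t * s) ** 2
    # Per-coordinate covariance between x_t and v within one component.
    cov_tv = t * s ** 2 - (1.0 - t)

    # Posterior responsibilities: w_k * N(x_t; t * mu_k, var_t * I), normalized.
    sq_dist = np.sum((x_t[None, :] - t * atoms) ** 2, axis=1)
    log_post = np.log(weights) - 0.5 * sq_dist / var_t
    log_post -= log_post.max()  # stabilize the exponentials
    post = np.exp(log_post)
    post /= post.sum()

    # Gaussian conditioning of v on x_t inside each component.
    means = atoms + (cov_tv / var_t) * (x_t[None, :] - t * atoms)
    var = s ** 2 / var_t  # shared isotropic conditional variance
    return post, means, var


def sample_velocity(x_t, t, atoms, weights, s, n, rng):
    """Draw n velocity samples from the closed-form mixture p(v | x_t)."""
    post, means, var = conditional_velocity_gmm(x_t, t, atoms, weights, s)
    ks = rng.choice(len(post), size=n, p=post)
    noise = rng.standard_normal((n, x_t.shape[0]))
    return means[ks] + np.sqrt(var) * noise
```

Under these assumptions, sampling the surrogate velocity needs only a categorical draw and a Gaussian draw, with no learned network. At t = 0 the mixture reduces to components centered at `atoms - x_t` with the prior weights, which gives a quick sanity check on the formulas.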