Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often causes instability and requires extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM does not require pre-training initialization or the optimization of two networks; and unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable under various hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet 256×256 with 1.99 FID using only 8 inference steps, and achieves a state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
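The abstract does not spell out IMM's training objective, but the name refers to matching distributions through their moments. As rough intuition only, a classic moment-matching discrepancy is the kernel Maximum Mean Discrepancy (MMD), which compares two sample sets via kernel statistics; the sketch below (function names `rbf_kernel` and `mmd2`, and the Gaussian kernel choice, are illustrative assumptions, not the paper's actual loss) shows how such a distribution-level distance can be estimated from samples:

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # Gaussian (RBF) kernel matrix between two sample sets of shape (n, d) and (m, d).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Biased estimator of squared MMD: matches all kernel moments of the two
    # distributions; it is zero iff the distributions agree (for a characteristic kernel).
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))   # samples from N(0, I)
y = rng.normal(0.0, 1.0, size=(200, 2))   # same distribution, different samples
z = rng.normal(3.0, 1.0, size=(200, 2))   # shifted distribution N(3, I)
print(mmd2(x, y), mmd2(x, z))  # matched pair yields a much smaller discrepancy
```

A loss of this family penalizes mismatch between whole distributions rather than individual sample pairs, which is one way to read the abstract's claim of "distribution-level convergence".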