Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM does not require pre-trained initialization or the optimization of two networks; and unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable across a range of hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet-256x256 with 1.99 FID using only 8 inference steps and achieves a state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
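The abstract does not spell out the training objective, but "moment matching" generically refers to comparing distributions through their (kernel-induced) moments rather than individual samples. As a hedged illustration of that general idea only, and not of IMM's actual objective, the sketch below computes a squared Maximum Mean Discrepancy (MMD) between two sample sets with an RBF kernel; the function names and bandwidth choice are assumptions for this example.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Pairwise RBF (Gaussian) kernel between rows of x and rows of y.
    d2 = np.sum(x**2, axis=1)[:, None] + np.sum(y**2, axis=1)[None, :] - 2 * x @ y.T
    return np.exp(-d2 / (2 * bandwidth**2))

def mmd2(x, y, bandwidth=1.0):
    # Biased squared-MMD estimate: zero iff the kernel mean embeddings agree,
    # i.e. all moments induced by the RBF kernel match between the two samples.
    kxx = rbf_kernel(x, x, bandwidth)
    kyy = rbf_kernel(y, y, bandwidth)
    kxy = rbf_kernel(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2 * kxy.mean()
```

A distribution-level criterion like this is what allows a few-step sampler to be trained against a target distribution directly, rather than matching a teacher network sample-by-sample as in distillation.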