Slow iterative sampling remains a major bottleneck for the practical deployment of diffusion and flow-based generative models. While consistency models (CMs) are a state-of-the-art distillation-based approach to efficient generation, their large-scale application is still limited by two key issues: training instability and inflexible sampling. Existing methods seek to mitigate these problems through architectural adjustments or regularized objectives, yet overlook the critical role of trajectory selection. In this work, we first analyze these two limitations: training instability originates from loss divergence induced by an unstable self-supervised term, whereas sampling inflexibility arises from error accumulation. Based on this analysis, we propose the Dual-End Consistency Model (DE-CM), which selects vital sub-trajectory clusters to achieve stable and effective training. DE-CM decomposes the PF-ODE trajectory and selects three critical sub-trajectories as optimization targets. Specifically, our approach leverages continuous-time CM objectives for few-step distillation and uses flow matching as a boundary regularizer to stabilize training. Furthermore, we propose a novel noise-to-noisy (N2N) mapping that maps noise to any point on the trajectory, thereby alleviating error accumulation in the first step. Extensive experiments demonstrate the effectiveness of our method: it achieves a state-of-the-art FID of 1.70 for one-step generation on ImageNet 256×256, outperforming existing CM-based one-step approaches.
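To make the two loss terms named above concrete, here is a minimal numpy sketch of (a) a consistency objective, which asks a model to produce the same prediction at two points on one trajectory, and (b) a flow-matching regularizer, which matches a velocity model to the analytic velocity of the interpolation path. It assumes the common rectified-flow convention x_t = (1 - t) x0 + t z; the function names and the oracle models are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def interpolate(x0, z, t):
    # Linear (rectified-flow) interpolation between data x0 and noise z.
    return (1.0 - t) * x0 + t * z

def flow_velocity(x0, z):
    # Analytic velocity of the linear path: d x_t / dt = z - x0.
    return z - x0

def consistency_loss(f, x0, z, t, s):
    # Self-supervised consistency term: predictions at two times
    # on the same trajectory should agree (f is a hypothetical model).
    xt, xs = interpolate(x0, z, t), interpolate(x0, z, s)
    return float(np.mean((f(xt, t) - f(xs, s)) ** 2))

def flow_matching_loss(v_model, x0, z, t):
    # Boundary regularizer: match the model's predicted velocity to
    # the analytic velocity of the interpolation path.
    xt = interpolate(x0, z, t)
    return float(np.mean((v_model(xt, t) - flow_velocity(x0, z)) ** 2))

# Sanity check with "oracle" models that solve the toy task exactly
# (they close over z / x0, which a real network cannot do):
x0 = rng.normal(size=8)
z = rng.normal(size=8)
oracle_f = lambda x, t: (x - t * z) / (1.0 - t)  # inverts the path back to x0
oracle_v = lambda x, t: z - x0                   # true path velocity

print(consistency_loss(oracle_f, x0, z, 0.7, 0.3))  # ~0: consistent along the trajectory
print(flow_matching_loss(oracle_v, x0, z, 0.5))     # ~0: perfect velocity match
```

In a real training loop the two terms would be weighted and summed, and the oracle lambdas replaced by a neural network; the sketch only shows the shape of the objectives.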