The slow inference of image diffusion models significantly degrades the interactive user experience. To address this, we introduce Diffusion Preview, a paradigm that uses rapid, low-step sampling to generate preliminary outputs for user evaluation and defers full-step refinement until the user deems a preview satisfactory. Existing acceleration methods, including training-free solvers and post-training distillation, struggle to deliver high-quality previews or to ensure consistency between previews and final outputs. We propose ConsistencySolver, a lightweight, trainable high-order solver derived from general linear multistep methods and optimized via reinforcement learning, which improves both preview quality and preview-to-final consistency. Experiments show that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, making it well suited to efficient preview-and-refine workflows. Notably, it achieves FID scores on par with Multistep DPM-Solver using 47% fewer steps, while outperforming distillation baselines. Furthermore, user studies indicate that our approach reduces overall user interaction time by nearly 50% while maintaining generation quality. Code is available at https://github.com/G-U-N/consolver.
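To make the role of a linear multistep solver in a preview-and-refine setting concrete, the following is a minimal sketch under stated assumptions, not the ConsistencySolver implementation. It integrates a toy scalar ODE, which stands in for the sampling ODE of a diffusion model, and uses the fixed classical two-step Adams-Bashforth coefficients where a trainable solver would presumably learn its weights. The names `multistep_solve`, `toy_drift`, and the step counts are hypothetical and chosen only for illustration.

```python
import math


def multistep_solve(f, x0, t0, t1, num_steps, coeffs):
    """Integrate dx/dt = f(t, x) from t0 to t1 with an explicit linear multistep rule.

    coeffs[j] weights the derivative evaluated j steps back. They are fixed
    constants here; a trainable solver would learn them instead (assumption).
    """
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    history = []  # past derivatives, most recent first
    for _ in range(num_steps):
        history.insert(0, f(t, x))
        del history[len(coeffs):]  # keep only as many as the rule needs
        if len(history) < len(coeffs):
            # Warm-up: not enough history yet, so take a plain Euler step.
            update = history[0]
        else:
            update = sum(c * d for c, d in zip(coeffs, history))
        x += h * update
        t += h
    return x


def toy_drift(t, x):
    # Hypothetical stand-in for the sampling ODE of a diffusion model.
    return -x


EULER = [1.0]                    # first-order rule
ADAMS_BASHFORTH_2 = [1.5, -0.5]  # classical second-order rule

# Every run starts from the same point, as a preview and its refinement
# would start from the same initial noise.
final = multistep_solve(toy_drift, 1.0, 0.0, 1.0, 200, ADAMS_BASHFORTH_2)
preview_low_order = multistep_solve(toy_drift, 1.0, 0.0, 1.0, 4, EULER)
preview_high_order = multistep_solve(toy_drift, 1.0, 0.0, 1.0, 4, ADAMS_BASHFORTH_2)

print("final (200 steps):     ", final)
print("preview gap, Euler:    ", abs(preview_low_order - final))
print("preview gap, 2-step AB:", abs(preview_high_order - final))
```

On this toy problem the four-step, second-order preview lands closer to the 200-step result than the four-step Euler preview does. This is the kind of preview-to-final consistency the abstract targets, although the toy ODE says nothing about image quality or FID.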