Diffusion models are computationally expensive, chiefly because high-quality image generation requires many repeated denoising steps, and this cost is a major obstacle to their widespread adoption. Several studies have addressed the issue by using advanced ODE solvers to reduce the number of score function evaluations (NFE) without fine-tuning, but fewer denoising iterations leave fewer opportunities to update fine details, resulting in noticeable quality degradation. In this work, we introduce an acceleration technique that exploits the temporal redundancy inherent in diffusion models. Reusing feature maps with high temporal similarity opens up a new opportunity to save computation without sacrificing output quality. To realize the practical benefits of this intuition, we conduct an extensive analysis and propose a novel method, FRDiff. FRDiff is designed to combine the advantages of reduced NFE and feature reuse, achieving a Pareto-optimal trade-off between fidelity and latency across various generative tasks.
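To make the feature-reuse idea concrete, the following is a minimal sketch and not the FRDiff implementation. It assumes a fixed reuse interval and uses a toy stand-in for an expensive network block; the names `ReusableBlock`, `toy_block`, `run_sampler`, and the `interval` parameter are hypothetical and chosen only for illustration. The cached output of the block is reused on steps where it is not recomputed, so the number of full evaluations drops while the number of sampler steps stays the same.

```python
from typing import Callable, List, Optional

Vector = List[float]


class ReusableBlock:
    """Wraps an expensive block and reuses its cached output between refreshes."""

    def __init__(self, fn: Callable[[Vector, int], Vector], interval: int):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.fn = fn
        self.interval = interval  # hypothetical knob: recompute every `interval` steps
        self.cache: Optional[Vector] = None
        self.calls = 0  # number of full (expensive) evaluations performed

    def __call__(self, x: Vector, step_index: int) -> Vector:
        # Recompute on the first call and on every `interval`-th step;
        # otherwise return the feature map saved from an earlier step.
        if self.cache is None or step_index % self.interval == 0:
            self.cache = self.fn(x, step_index)
            self.calls += 1
        return self.cache


def toy_block(x: Vector, step_index: int) -> Vector:
    # Stand-in for an expensive residual branch (assumption, not a real denoiser).
    return [0.1 * v for v in x]


def run_sampler(num_steps: int, interval: int):
    """Runs a toy iterative update and reports how often the block was evaluated."""
    block = ReusableBlock(toy_block, interval)
    x = [1.0, -2.0, 0.5]
    for i in range(num_steps):
        residual = block(x, i)
        x = [v - r for v, r in zip(x, residual)]
    return x, block.calls


if __name__ == "__main__":
    for k in (1, 2, 3):
        _, calls = run_sampler(num_steps=10, interval=k)
        print(f"interval={k}: full block evaluations={calls}")
```

With `interval=1` the block is evaluated at every step, which corresponds to no reuse. Larger intervals trade exactness of the cached features for fewer evaluations, and how well that holds up in a real model depends on how similar the features actually are across steps.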