Diffusion models have emerged as state-of-the-art in image generation, but their practical deployment is hindered by the significant computational cost of their iterative denoising process. While existing caching techniques can accelerate inference, they often create a challenging trade-off between speed and fidelity, suffering from quality degradation and high computational overhead. To address these limitations, we introduce H2-Cache, a novel hierarchical caching mechanism designed for modern generative diffusion model architectures. Our method is founded on the key insight that the denoising process can be functionally separated into a structure-defining stage and a detail-refining stage. H2-Cache leverages this by employing a dual-threshold system, using independent thresholds to selectively cache each stage. To ensure the efficiency of our dual-check approach, we introduce pooled feature summarization (PFS), a lightweight technique for robust and fast similarity estimation. Extensive experiments on the Flux architecture demonstrate that H2-Cache achieves significant acceleration (up to 5.08x) while maintaining image quality nearly identical to the baseline, quantitatively and qualitatively outperforming existing caching methods. Our work presents a robust and practical solution that effectively resolves the speed-quality dilemma, significantly lowering the barrier for the real-world application of high-fidelity diffusion models. Source code is available at https://github.com/Bluear7878/H2-cache-A-Hierarchical-Dual-Stage-Cache.