Recently, a series of diffusion-aware distillation algorithms have emerged to alleviate the computational overhead associated with the multi-step inference process of Diffusion Models (DMs). Current distillation techniques generally fall into two distinct categories: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation. However, these approaches suffer from severe performance degradation or domain shifts. To address these limitations, we propose Hyper-SD, a novel framework that synergistically combines the advantages of ODE Trajectory Preservation and Reformulation, while maintaining near-lossless performance during step compression. First, we introduce Trajectory Segmented Consistency Distillation to progressively perform consistency distillation within pre-defined time-step segments, which facilitates the preservation of the original ODE trajectory from a higher-order perspective. Second, we incorporate human feedback learning to boost the performance of the model in the low-step regime and mitigate the performance loss incurred by the distillation process. Third, we integrate score distillation to further improve the low-step generation capability of the model, and offer the first attempt to leverage a unified LoRA to support the inference process at all steps. Extensive experiments and user studies demonstrate that Hyper-SD achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5. For example, Hyper-SDXL surpasses SDXL-Lightning by +0.68 in CLIP Score and +0.51 in Aes Score in 1-step inference.
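The core idea of Trajectory Segmented Consistency Distillation can be illustrated with a minimal sketch: the training timestep range is divided into equal segments, and within each segment the consistency mapping targets the segment's start boundary rather than t = 0, so the student only has to preserve the ODE trajectory locally. The function names and the equal-width segmentation below are illustrative assumptions, not the paper's exact implementation.

```python
def segment_boundaries(num_train_steps: int, num_segments: int):
    """Split the timestep range [0, num_train_steps) into equal segments.

    Returns the start timestep of each segment. Within a segment, the
    consistency function maps noisy states to the segment's start
    boundary instead of all the way to t = 0 (a simplified view of
    Trajectory Segmented Consistency Distillation; equal-width segments
    are an assumption here).
    """
    size = num_train_steps // num_segments
    return [i * size for i in range(num_segments)]


def consistency_target_step(t: int, boundaries):
    """For a sampled timestep t, return the segment boundary that the
    consistency function should map the state at t onto."""
    # The target is the largest segment boundary not exceeding t.
    return max(b for b in boundaries if b <= t)
```

For instance, with 1000 training steps and 4 segments, a state sampled at t = 600 is distilled toward the boundary at t = 500 rather than toward t = 0, which keeps each consistency constraint local to its segment.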