In the era of AIGC, the demand for low-budget and even on-device applications of diffusion models has emerged. To compress the Stable Diffusion models (SDMs), several approaches have been proposed, most of which rely on handcrafted layer removal to obtain a smaller U-Net, together with knowledge distillation to recover network performance. However, such handcrafted layer removal is inefficient and lacks scalability and generalization, and the feature distillation used in the retraining phase suffers from an imbalance issue in which a few numerically large feature loss terms dominate the others throughout retraining. To this end, we propose LAPTOP-Diff: layer pruning and normalized distillation for compressing diffusion models. Specifically, we 1) introduce a layer pruning method to automatically compress the SDM's U-Net and propose an effective one-shot pruning criterion whose one-shot performance is guaranteed by its good additivity property, surpassing other layer pruning and handcrafted layer removal methods, and 2) propose normalized feature distillation for retraining, which alleviates the imbalance issue. Using the proposed LAPTOP-Diff, we compressed the U-Nets of SDXL and SDM-v1.5 and achieved state-of-the-art performance: at a 50% pruning ratio, the PickScore declines by only 4.0%, whereas the smallest PickScore decline among comparative methods is 8.2%.
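To make the additivity-based one-shot criterion concrete, the following is a minimal sketch of how such a selection could be carried out: each candidate layer is scored by the output distortion measured when that layer alone is removed, and the distortion of removing a whole set is approximated by the sum of the individual scores, turning the search into a knapsack-style selection. All names here (`layer_names`, `layer_sizes`, `layer_scores`, `budget`) are illustrative assumptions, not the paper's actual implementation.

```python
import itertools

def one_shot_layer_selection(layer_names, layer_sizes, layer_scores, budget):
    """Pick a set of layers to prune whose removed parameter count meets the
    budget while minimizing the summed per-layer scores.

    Additivity assumption: the output loss of removing a set of layers is
    approximated by the sum of the losses of removing each layer alone,
    so no per-candidate retraining or joint evaluation is needed.
    """
    best_set, best_total = None, float("inf")
    # Exhaustive search for illustration; a greedy or DP knapsack scales better.
    for r in range(1, len(layer_names) + 1):
        for subset in itertools.combinations(layer_names, r):
            if sum(layer_sizes[n] for n in subset) < budget:
                continue  # not enough parameters removed to meet the pruning ratio
            total = sum(layer_scores[n] for n in subset)  # additivity assumption
            if total < best_total:
                best_set, best_total = set(subset), total
    return best_set
```

The selection is "one-shot" in the sense that the per-layer scores are measured once, up front, and the subset is chosen directly from them without iterative prune-and-retrain rounds.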
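For the imbalance issue in feature distillation, the sketch below illustrates one way per-term normalization could look: each layer's feature loss is rescaled by its own detached magnitude so that numerically large terms cannot dominate the total. This is a minimal sketch under that assumption; the paper's exact normalization may differ.

```python
import torch
import torch.nn.functional as F

def normalized_feature_distillation_loss(student_feats, teacher_feats, eps=1e-8):
    """Sum per-layer feature losses, each divided by its own detached value,
    so every term contributes at a comparable scale to the gradient."""
    total = torch.zeros(())
    for fs, ft in zip(student_feats, teacher_feats):
        term = F.mse_loss(fs, ft)
        # Dividing by the detached term keeps each contribution near unit scale
        # while preserving the direction of its gradient.
        total = total + term / (term.detach() + eps)
    return total
```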