Diffusion models have significantly advanced the field of generative modeling. However, training a diffusion model is computationally expensive, creating a pressing need to adapt off-the-shelf diffusion models for downstream generation tasks. Current fine-tuning methods focus on parameter-efficient transfer learning but overlook the fundamental transfer characteristics of diffusion models. In this paper, we investigate the transferability of diffusion models and observe a monotonic chain-of-forgetting trend in transferability along the reverse process. Based on this observation and novel theoretical insights, we present Diff-Tuning, a frustratingly simple transfer approach that leverages the chain-of-forgetting tendency. Diff-Tuning encourages the fine-tuned model to retain pre-trained knowledge at the end of the denoising chain close to the generated data, while discarding it toward the noise side. We conduct comprehensive experiments to evaluate Diff-Tuning, including the transfer of pre-trained Diffusion Transformer models to eight downstream generation tasks and the adaptation of Stable Diffusion to five control conditions with ControlNet. Diff-Tuning achieves a 26% improvement over standard fine-tuning and enhances the convergence speed of ControlNet by 24%. Notably, parameter-efficient transfer learning techniques for diffusion models can also benefit from Diff-Tuning.
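The retain-near-data, discard-near-noise idea can be illustrated with a timestep-dependent weighting. The following is a minimal sketch under stated assumptions, not the paper's actual objective: the linear schedule, the split into a "retention" and an "adaptation" loss term, and all function names are hypothetical choices made here for illustration. Timestep 0 is taken to be the data end of the denoising chain and the last timestep the noise end.

```python
def retention_weight(t: int, num_steps: int) -> float:
    """Hypothetical weight on preserving pre-trained behavior at timestep t.

    Assumes t = 0 is the data end of the chain and t = num_steps - 1 the
    noise end. The linear decay is an illustrative assumption.
    """
    if not 0 <= t < num_steps:
        raise ValueError("t must satisfy 0 <= t < num_steps")
    if num_steps == 1:
        return 1.0
    return 1.0 - t / (num_steps - 1)


def adaptation_weight(t: int, num_steps: int) -> float:
    """Complementary weight on fitting the downstream task at timestep t."""
    return 1.0 - retention_weight(t, num_steps)


def combined_loss(retention_loss: float, adaptation_loss: float,
                  t: int, num_steps: int) -> float:
    """Blend two per-timestep loss values (both assumed to be given).

    Near the data end the retention term dominates; near the noise end
    the adaptation term dominates.
    """
    return (retention_weight(t, num_steps) * retention_loss
            + adaptation_weight(t, num_steps) * adaptation_loss)
```

With this schedule, a sample drawn at a small timestep is pulled mostly toward the pre-trained model's behavior, while a sample drawn at a large timestep is trained almost entirely on the downstream objective. Any other monotonically decreasing schedule would express the same intuition.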