Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting

Diffusion models have significantly advanced the field of generative modeling. However, training a diffusion model is computationally expensive, creating a pressing need to adapt off-the-shelf diffusion models for downstream generation tasks. Current fine-tuning methods focus on parameter-efficient transfer learning but overlook the fundamental transfer characteristics of diffusion models. In this paper, we investigate the transferability of diffusion models and observe a monotonic chain-of-forgetting trend in transferability along the reverse process. Based on this observation and novel theoretical insights, we present Diff-Tuning, a frustratingly simple transfer approach that leverages the chain-of-forgetting tendency. Diff-Tuning encourages the fine-tuned model to retain the pre-trained knowledge at the end of the denoising chain close to the generated data, while discarding it at the noise end. We conduct comprehensive experiments to evaluate Diff-Tuning, including the transfer of pre-trained Diffusion Transformer models to eight downstream generation tasks and the adaptation of Stable Diffusion to five control conditions with ControlNet. Diff-Tuning achieves a 26% improvement over standard fine-tuning and enhances the convergence speed of ControlNet by 24%. Notably, parameter-efficient transfer learning techniques for diffusion models can also benefit from Diff-Tuning.
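The retain-near-data, discard-near-noise idea can be illustrated with timestep-dependent loss weights. The sketch below is only an illustration under stated assumptions, not the method as specified in the paper: the abstract does not give a weighting schedule, so the linear schedule, the convention that t = 0 is the data end and t = num_steps is the noise end, and the names retention_weight, adaptation_weight, and combined_loss are all hypothetical.

```python
def retention_weight(t: int, num_steps: int) -> float:
    """Weight on keeping pre-trained behavior (assumed linear schedule).

    Assumed convention: t = 0 is the data end of the denoising chain,
    t = num_steps is the pure-noise end. The weight is largest near the data.
    """
    return 1.0 - t / num_steps


def adaptation_weight(t: int, num_steps: int) -> float:
    """Weight on fitting the downstream task (assumed linear schedule).

    Largest near the noise end, where pre-trained knowledge is discarded.
    """
    return t / num_steps


def combined_loss(t: int, num_steps: int,
                  retention_loss: float, adaptation_loss: float) -> float:
    """Blend two per-timestep loss values with the hypothetical weights above."""
    return (retention_weight(t, num_steps) * retention_loss
            + adaptation_weight(t, num_steps) * adaptation_loss)


if __name__ == "__main__":
    # Print how the two weights trade off along a 1000-step chain.
    for t in (0, 250, 500, 750, 1000):
        print(t, retention_weight(t, 1000), adaptation_weight(t, 1000))
```

In a real training loop the two loss values would come from the model's denoising error on retained pre-trained behavior and on downstream data respectively; here they are plain floats so that the weighting alone is visible.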

