DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning

Diffusion models have proven highly effective at generating high-quality images. However, adapting large pre-trained diffusion models to new domains remains an open challenge, and one that is critical for real-world applications. This paper proposes DiffFit, a parameter-efficient strategy for fine-tuning large pre-trained diffusion models that enables fast adaptation to new domains. DiffFit is embarrassingly simple: it fine-tunes only the bias terms and newly added scaling factors in specific layers, yet yields a significant training speed-up and reduced model storage costs. Compared with full fine-tuning, DiffFit achieves a 2$\times$ training speed-up and needs to store only approximately 0.12\% of the total model parameters. We provide an intuitive theoretical analysis to justify the efficacy of the scaling factors for fast adaptation. On 8 downstream datasets, DiffFit achieves superior or competitive performance compared to full fine-tuning while being more efficient. Remarkably, we show that DiffFit can adapt a pre-trained low-resolution generative model to a high-resolution one at minimal additional cost. Among diffusion-based methods, DiffFit sets a new state-of-the-art FID of 3.02 on the ImageNet 512$\times$512 benchmark by fine-tuning for only 25 epochs from a public pre-trained ImageNet 256$\times$256 checkpoint, while being 30$\times$ more training-efficient than the closest competitor.
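As a rough illustration of why training only biases and scaling factors leaves so little to store, the sketch below counts parameters for a toy stack of linear layers in which the weight matrix is frozen and only the bias and one scaling factor are trainable. The class name `ScaledLinear`, the layer sizes, the choice of one scalar `gamma` per layer, and its initialization to 1.0 are all assumptions made for illustration; the abstract does not specify them, and the toy fraction is not expected to reproduce the reported 0.12\%.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScaledLinear:
    """Toy layer y = gamma * (W x + b): W is frozen, b and gamma are trainable.

    Hypothetical stand-in for a layer in a pre-trained model; not the
    architecture used in the paper.
    """
    in_dim: int
    out_dim: int
    gamma: float = 1.0  # assumed init: 1.0 leaves the pre-trained output unchanged

    def frozen_params(self) -> int:
        # The weight matrix W is kept at its pre-trained value.
        return self.in_dim * self.out_dim

    def trainable_params(self) -> int:
        # Bias vector b plus a single scalar scaling factor gamma.
        return self.out_dim + 1


def trainable_fraction(layers: List[ScaledLinear]) -> float:
    """Share of all parameters that are updated (and stored) after fine-tuning."""
    trainable = sum(layer.trainable_params() for layer in layers)
    total = trainable + sum(layer.frozen_params() for layer in layers)
    return trainable / total


# Hypothetical sizes chosen only to make the arithmetic concrete.
layers = [ScaledLinear(1024, 1024) for _ in range(4)]
frac = trainable_fraction(layers)
print(f"trainable fraction: {frac:.4%}")
```

Because the frozen weight count grows with the product of the layer dimensions while the trainable count grows only with the output dimension, the stored fraction shrinks as layers get wider, which is consistent with the small storage footprint described above.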

