SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

In recent years, the development of diffusion models has led to significant progress in image and video generation tasks, with pre-trained models like the Stable Diffusion series playing a crucial role. Inspired by model pruning, which lightens large pre-trained models by removing unimportant parameters, we propose a novel fine-tuning method that makes full use of these ineffective parameters and equips the pre-trained model with new task-specific capabilities. In this work, we first investigate the importance of parameters in pre-trained diffusion models and discover that the 10% to 20% of parameters with the smallest absolute values do not contribute to the generation process. Based on this observation, we propose a method termed SaRA that re-utilizes these temporarily ineffective parameters, which is equivalent to optimizing a sparse weight matrix to learn task-specific knowledge. To mitigate overfitting, we propose a nuclear-norm-based low-rank sparse training scheme for efficient fine-tuning. Furthermore, we design a new progressive parameter adjustment strategy to make full use of the re-trained/fine-tuned parameters. Finally, we propose a novel unstructured backpropagation strategy, which significantly reduces memory costs during fine-tuning. Our method enhances the generative capabilities of pre-trained models in downstream applications and outperforms traditional fine-tuning methods like LoRA in preserving the model's generalization ability. We validate our approach through fine-tuning experiments on SD models, demonstrating significant improvements. SaRA also has a practical advantage: it requires only a single line of code modification for efficient implementation and is seamlessly compatible with existing methods.
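The core idea in the abstract, selecting the smallest-magnitude weights and training only those, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 25% fraction, the learning rate, and the random stand-in gradient are all assumptions made for illustration, and the progressive adjustment and unstructured backpropagation strategies are not shown. The nuclear norm is computed only as the quantity such a regularizer would penalize.

```python
import numpy as np

def low_magnitude_mask(weight, fraction=0.1):
    """Boolean mask marking the `fraction` of entries with the smallest |value|."""
    k = int(weight.size * fraction)
    if k == 0:
        return np.zeros_like(weight, dtype=bool)
    # The k-th smallest absolute value serves as the selection threshold.
    threshold = np.partition(np.abs(weight).ravel(), k - 1)[k - 1]
    return np.abs(weight) <= threshold

def sparse_update(weight, grad, mask, lr=1e-2):
    """Gradient step applied only to the masked (low-magnitude) entries."""
    return weight - lr * grad * mask

def nuclear_norm(matrix):
    """Sum of singular values, a convex surrogate for matrix rank."""
    return np.linalg.svd(matrix, compute_uv=False).sum()

rng = np.random.default_rng(0)
W0 = rng.normal(size=(8, 8))          # stand-in for a pre-trained weight matrix
mask = low_magnitude_mask(W0, fraction=0.25)
grad = rng.normal(size=W0.shape)      # stand-in for a task gradient
W1 = sparse_update(W0, grad, mask)
delta = W1 - W0                       # sparse update learned on top of W0
penalty = nuclear_norm(delta)         # value a low-rank regularizer would penalize
```

In this sketch the unmasked entries of `W0` are left untouched, so the learned change `delta` is nonzero only at the selected positions, which mirrors the "sparse weight matrix" framing in the abstract.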

