Parameter-efficient fine-tuning has become the standard approach for adapting large language and vision models to downstream tasks. Specifically, the efficiency of low-rank adaptation (LoRA) has facilitated the creation and sharing of hundreds of custom LoRA modules, each trained on distinct data from various downstream tasks. In this paper, we explore the composability of LoRA modules, examining whether combining these pre-trained modules enhances generalization to unseen downstream tasks. We evaluate two approaches: (a) uniform composition, which averages upstream LoRA modules with equal weights, and (b) learned composition, where we learn a weight for each upstream module and perform a weighted average. Our experimental results on both vision and language models reveal that in few-shot settings, where only a limited number of samples are available for the downstream task, both uniform and learned composition achieve better transfer accuracy, outperforming full fine-tuning and training a LoRA from scratch. Moreover, in full-shot settings, learned composition performs comparably to regular LoRA training while using significantly fewer trainable parameters. Our research highlights the potential of uniform composition for enhancing transferability in low-shot settings without introducing additional learnable parameters.
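To make the two composition schemes concrete, below is a minimal PyTorch-style sketch for a single adapted layer. The function name `compose_loras`, the softmax normalization of the learned coefficients, and the choice to average the full low-rank updates B_i A_i (rather than averaging the A and B factors separately) are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch

def compose_loras(lora_As, lora_Bs, weights=None):
    """Compose pre-trained LoRA modules for one target layer.

    lora_As: list of down-projection matrices, each of shape [r, d_in]
    lora_Bs: list of up-projection matrices, each of shape [d_out, r]
    weights: optional per-module coefficients (learned composition);
             if None, all modules are weighted equally (uniform composition)
    """
    n = len(lora_As)
    if weights is None:
        weights = torch.full((n,), 1.0 / n)          # uniform composition
    else:
        weights = torch.softmax(weights, dim=0)      # normalize learned coefficients

    # Weighted average of the per-module low-rank updates B_i @ A_i
    delta = sum(w * (B @ A) for w, A, B in zip(weights, lora_As, lora_Bs))
    return delta  # added to the frozen base weight: W + delta

# Usage sketch: for learned composition, `weights` would be a small
# torch.nn.Parameter of size n (one scalar per upstream module) optimized on
# the few-shot downstream data, which is why the number of trainable
# parameters stays far below that of regular LoRA training.
```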