Low-Rank Adaptation (LoRA) methods have emerged as crucial techniques for adapting large pre-trained models to downstream tasks under computational and memory constraints. However, they face a fundamental challenge in balancing task-specific performance gains against catastrophic forgetting of pre-trained knowledge, a trade-off for which existing methods provide inconsistent recommendations. This paper presents a comprehensive analysis of the performance-forgetting trade-offs inherent in low-rank adaptation initialized from principal components. Our investigation reveals that fine-tuning intermediate components achieves a better balance and is more robust to high learning rates than fine-tuning the first (PiSSA) or last (MiLoRA) components used in existing work. Building on these findings, we propose a practical LoRA initialization scheme that offers superior trade-offs. In a thorough empirical study across a variety of computer vision and NLP tasks, we demonstrate that our approach improves accuracy and reduces forgetting, including in continual learning scenarios.
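For concreteness, the sketch below illustrates SVD-based LoRA initialization generalized to an arbitrary window of singular components, in the spirit of the approaches contrasted above: `offset = 0` corresponds to top-component (PiSSA-style) initialization, a large offset approaches bottom-component (MiLoRA-style) initialization, and intermediate offsets select the mid-spectrum components studied here. This is a minimal PyTorch sketch, not the paper's implementation; `init_lora_from_svd` and `offset` are illustrative names.

```python
# Minimal sketch (illustrative, not the authors' code): initialize LoRA
# factors B, A from a chosen window of r singular components of a
# pretrained weight W, and keep the remainder as a frozen residual.
import torch

def init_lora_from_svd(W: torch.Tensor, r: int, offset: int):
    """Split W into a frozen residual plus trainable low-rank factors B @ A,
    built from the r singular components starting at index `offset`."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    idx = slice(offset, offset + r)            # chosen spectral window
    sqrt_S = torch.sqrt(S[idx])
    B = U[:, idx] * sqrt_S                     # (out_features, r)
    A = sqrt_S.unsqueeze(1) * Vh[idx, :]       # (r, in_features)
    W_res = W - B @ A                          # frozen residual weight
    return W_res, B, A

# Usage: rank-16 adapter initialized from components 16..31 of the
# spectrum (an intermediate window); only B and A would be trained.
W = torch.randn(768, 768)                      # stand-in for a pretrained weight
W_res, B, A = init_lora_from_svd(W, r=16, offset=16)
assert torch.allclose(W_res + B @ A, W, atol=1e-4)
```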