This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models that builds upon the pioneering works of Low-Rank Adaptation (LoRA) and Orthogonal Adaptation. SBoRA either halves the number of trainable parameters or doubles the rank at a similar trainable-parameter count to LoRA, while improving learning performance. By utilizing orthogonal standard basis vectors to initialize one of the low-rank matrices (either $\mathbf{A}$ or $\mathbf{B}$), SBoRA facilitates regional weight updates and memory-efficient fine-tuning. This results in two variants, SBoRA-FA and SBoRA-FB, where only one of the matrices is updated, leading to a sparse update matrix $\mathrm{\Delta} \mathbf{W}$ with predominantly zero rows or columns. Consequently, most of the fine-tuned model's weights $(\mathbf{W}_0+\mathrm{\Delta} \mathbf{W})$ remain unchanged from the pre-trained weights, akin to the modular organization of the human brain, which efficiently adapts to new tasks. Our empirical results demonstrate the superiority of SBoRA-FA over LoRA in various fine-tuning tasks, including commonsense reasoning and arithmetic reasoning. Furthermore, we evaluate the effectiveness of QSBoRA on quantized LLaMA models of varying scales, highlighting its potential for efficient adaptation to new tasks. Code is available at https://github.com/cityuhkai/SBoRA.
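The sparsity structure described above can be illustrated with a minimal numpy sketch of the SBoRA-FA variant. This is an assumption-laden toy construction, not the paper's implementation: $\mathbf{A}$ is frozen to $r$ standard basis rows (rows of the identity), only $\mathbf{B}$ would be trained, and the resulting $\mathrm{\Delta}\mathbf{W} = \mathbf{B}\mathbf{A}$ is nonzero only in the $r$ selected columns, so the rest of $\mathbf{W}_0$ is untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2  # toy dimensions; real layers are much larger

# SBoRA-FA (sketch): A is fixed to r orthogonal standard-basis vectors,
# i.e. r rows of the identity matrix; only B is a trainable parameter.
idx = rng.choice(d_in, size=r, replace=False)  # hypothetical choice of input dims
A = np.eye(d_in)[idx]                          # (r, d_in), frozen
B = rng.standard_normal((d_out, r)) * 0.01     # (d_out, r), trainable

delta_W = B @ A                                # (d_out, d_in) update matrix

# Because each row of A is a standard basis vector e_i, column k of
# delta_W is zero unless k is one of the r selected indices.
zero_cols = [c for c in range(d_in) if c not in idx]
assert np.allclose(delta_W[:, zero_cols], 0.0)

# Memory note: A never needs to be materialized; the forward pass
# delta_W @ x reduces to B @ x[idx], a simple index-select.
x = rng.standard_normal(d_in)
assert np.allclose(delta_W @ x, B @ x[idx])
```

In the FB variant the roles are swapped: $\mathbf{B}$ holds frozen standard basis columns and $\mathbf{A}$ is trained, yielding an update with predominantly zero rows instead of columns.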