Parameter-efficient fine-tuning (PEFT) is widely studied for its effectiveness and efficiency in the era of large language models. Low-rank adaptation (LoRA) is a popular and representative PEFT method that has demonstrated strong performance. However, it is implemented with a fixed intrinsic rank, which may not be the ideal setting for a given downstream task. Recognizing the need for more flexible downstream task adaptation, we extend LoRA to an approach we call allocating low-rank adaptation (ALoRA), which enables dynamic adjustment of the intrinsic rank during the adaptation process. First, we propose a novel method, AB-LoRA, that can effectively estimate the importance score of each LoRA rank. Second, guided by AB-LoRA, we gradually prune redundant and negatively impacting LoRA ranks and allocate the pruned LoRA budget to important Transformer modules that need higher ranks. We have conducted experiments on various tasks, and the results demonstrate that ALoRA can outperform recent baselines with a comparable number of tunable parameters.
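The prune-and-reallocate step described above can be illustrated with a minimal sketch. It assumes that per-rank importance scores are already available (how AB-LoRA computes them is not specified here), and the function name `reallocate_ranks`, the module names, the score values, and the rule for choosing which modules receive the freed ranks are all hypothetical choices for illustration, not the paper's exact procedure.

```python
# Illustrative sketch only: prune the lowest-scoring LoRA ranks and hand the
# freed budget to other modules. Scores, module names, and the allocation
# rule are assumptions made for this example, not the ALoRA algorithm itself.

def reallocate_ranks(scores, n_prune):
    """scores: dict mapping module name -> non-empty list of per-rank importance scores.
    Returns a dict mapping module name -> new rank count; the total rank budget is unchanged."""
    # Flatten to (score, module, rank_index) and sort so the weakest ranks come first.
    flat = sorted(
        (s, name, i)
        for name, rank_scores in scores.items()
        for i, s in enumerate(rank_scores)
    )
    pruned = flat[:n_prune]

    # Start from the current rank of each module and remove the pruned ranks.
    new_ranks = {name: len(rank_scores) for name, rank_scores in scores.items()}
    for _, name, _ in pruned:
        new_ranks[name] -= 1

    # Assumption: modules that lost no rank are the ones "needing higher ranks".
    pruned_modules = {name for _, name, _ in pruned}
    receivers = [name for name in scores if name not in pruned_modules]
    if not receivers:
        receivers = list(scores)
    # Prefer receivers with the highest mean importance score.
    receivers.sort(key=lambda n: -sum(scores[n]) / len(scores[n]))

    # Hand out one freed rank at a time, cycling through the receivers.
    for k in range(len(pruned)):
        new_ranks[receivers[k % len(receivers)]] += 1
    return new_ranks


# Hypothetical scores for three modules, each starting at rank 4.
example_scores = {
    "query": [0.9, 0.8, 0.7, 0.6],
    "value": [0.5, 0.05, 0.02, 0.4],
    "ffn_up": [0.3, 0.01, 0.6, 0.2],
}
allocation = reallocate_ranks(example_scores, n_prune=3)
print(allocation)  # {'query': 7, 'value': 2, 'ffn_up': 3}
```

In this toy run the three weakest ranks (two in `value`, one in `ffn_up`) are removed and all three are handed to `query`, so the total stays at 12 ranks. A gradual schedule, as the abstract describes, would repeat a step like this several times during training with freshly estimated scores.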