As pre-trained language models (PLMs) grow in size, fine-tuning all of a model's parameters becomes inefficient, especially when there are many downstream tasks, each of which incurs significant training and storage costs. Many parameter-efficient fine-tuning (PEFT) approaches have been proposed. Among them, Low-Rank Adaptation (LoRA) is a representative approach that injects trainable rank decomposition matrices into every target module. However, LoRA ignores the differing importance of parameters across modules. To address this problem, several works prune the parameters of LoRA. Yet under limited training conditions, the upper bound on the rank of each pruned parameter matrix is still constrained by the preset values. We therefore propose IncreLoRA, an incremental parameter allocation method that adaptively adds trainable parameters during training based on the importance score of each module. Unlike pruning, this approach is not limited by the initial number of trainable parameters, and each parameter matrix has a higher rank upper bound for the same training overhead. We conduct extensive experiments on GLUE to demonstrate the effectiveness of IncreLoRA. The results show that our method achieves higher parameter efficiency, especially in low-resource settings, where it significantly outperforms the baselines. Our code is publicly available.
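The idea of growing LoRA ranks from importance scores can be illustrated with a minimal sketch. It is not the IncreLoRA implementation: the abstract does not specify how importance is computed or how many parameters are added per step, so the function `grow_ranks`, the top-k selection rule, the increment of one rank per step, the total rank budget, and the placeholder scores are all assumptions made for illustration. The only fixed fact used is that a LoRA update of rank r on a d_out x d_in weight has r * (d_in + d_out) trainable parameters.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one LoRA update B @ A (B: d_out x rank, A: rank x d_in)."""
    return rank * (d_in + d_out)


def grow_ranks(ranks, scores, top_k, max_total_rank):
    """Return a new allocation with one extra rank for the top_k highest-scoring modules.

    Hypothetical rule for illustration only: growth stops once the total
    rank budget max_total_rank is reached.
    """
    new_ranks = dict(ranks)
    remaining = max_total_rank - sum(new_ranks.values())
    # Visit modules from most to least important; ties are broken by name.
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    for name in ordered[:top_k]:
        if remaining <= 0:
            break
        new_ranks[name] += 1
        remaining -= 1
    return new_ranks


# Every module starts at rank 1; the scores are made-up placeholders.
ranks = {"query": 1, "key": 1, "value": 1, "output": 1}
scores = {"query": 0.9, "key": 0.1, "value": 0.7, "output": 0.3}

# Two allocation steps; scores are held fixed here for simplicity,
# whereas a real run would recompute them during training.
for _ in range(2):
    ranks = grow_ranks(ranks, scores, top_k=2, max_total_rank=8)

# Total trainable parameters, assuming 768 x 768 weights in every module.
total_params = sum(lora_param_count(768, 768, r) for r in ranks.values())
```

In this toy run, the two highest-scoring modules end with a larger rank than the others while the total stays within the budget. A pruning scheme with the same budget would instead have to start every module at its maximum rank and shrink from there.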