We present a novel Parameter-Efficient Fine-Tuning (PEFT) method, dubbed Adaptive Freezing of Low Rank Adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, we add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, we then incrementally freeze these projection matrices during fine-tuning to reduce computation and alleviate over-fitting. Our experimental results demonstrate that we can achieve state-of-the-art performance, with an average improvement of up to $0.85\%$ as evaluated on the GLUE benchmark, while yielding up to $9.5\times$ fewer average trainable parameters. When compared in terms of runtime, AFLoRA can yield up to $1.86\times$ improvement over similar PEFT alternatives. Besides the practical utility of our approach, we provide insights on the trainability requirements of LoRA paths at different modules and the freezing schedule for the different projection matrices. Code will be released.
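The parallel path and the effect of freezing on the trainable-parameter count can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the class `LowRankPath`, the helpers `adapted_forward` and `maybe_freeze`, the initialization values, and the threshold rule are all hypothetical, and the freezing score is passed in as a plain number because the abstract does not define it.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LowRankPath:
    """Hypothetical parallel low-rank path (illustrative, not the authors' code)."""

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        # Down-projection (rank x d_in) and up-projection (d_out x rank).
        self.down = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.up = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_out)]
        # One feature transformation vector after each projection.
        self.s_down = [1.0] * rank
        # Assumption: zero init so the path contributes nothing at the start.
        self.s_up = [0.0] * d_out
        self.frozen = {"down": False, "up": False}

    def forward(self, x):
        h = [s * v for s, v in zip(self.s_down, matvec(self.down, x))]
        return [s * v for s, v in zip(self.s_up, matvec(self.up, h))]

    def trainable_params(self):
        # The two vectors are counted as trainable throughout this sketch.
        n = len(self.s_down) + len(self.s_up)
        if not self.frozen["down"]:
            n += len(self.down) * len(self.down[0])
        if not self.frozen["up"]:
            n += len(self.up) * len(self.up[0])
        return n


def adapted_forward(W0, path, x):
    """Frozen pre-trained weight W0 plus the parallel low-rank path."""
    return [b + p for b, p in zip(matvec(W0, x), path.forward(x))]


def maybe_freeze(path, scores, threshold):
    """Placeholder rule (assumption): freeze a projection once its score
    reaches the threshold. The real score and criterion are not given here."""
    for name, score in scores.items():
        if score >= threshold:
            path.frozen[name] = True


path = LowRankPath(d_in=4, d_out=3, rank=2)
print(path.trainable_params())  # 19: vectors (2 + 3) + down (2*4) + up (3*2)
maybe_freeze(path, {"down": 0.9, "up": 0.1}, threshold=0.5)
print(path.trainable_params())  # 11: the down-projection no longer counts
```

In this toy setting, freezing both projection matrices would leave only the two feature transformation vectors trainable, which is the mechanism by which incremental freezing reduces the trainable-parameter count.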