As large language models (LLMs) grow in size, traditional full fine-tuning becomes increasingly impractical due to its high computational and storage costs. Although popular parameter-efficient fine-tuning methods such as LoRA substantially reduce the number of tunable parameters, room remains for further optimization. In this work, we propose ASLoRA, a cross-layer parameter-sharing strategy that combines global sharing with partial adaptive sharing. Specifically, we share the low-rank matrix A across all layers and adaptively merge the matrices B during training. This sharing mechanism not only mitigates overfitting effectively but also captures inter-layer dependencies, significantly enhancing the model's representational capability. We conduct extensive experiments on various NLP tasks, showing that ASLoRA outperforms LoRA while using less than 25% of its parameters, highlighting its flexibility and superior parameter efficiency. Furthermore, in-depth analyses of the adaptive sharing strategy confirm its significant advantages in enhancing both model flexibility and task adaptability.
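The parameter-efficiency claim can be made concrete with a back-of-the-envelope count. The sketch below compares the trainable parameters of standard LoRA (a separate A and B per layer) against an ASLoRA-style scheme in which a single A is shared across all layers and the B matrices are merged into a small number of groups. The layer count, hidden size, rank, and group count here are illustrative assumptions, not values reported in the paper.

```python
def lora_params(num_layers: int, d: int, r: int) -> int:
    # Standard LoRA: each layer carries its own A (d x r) and B (r x d).
    return num_layers * (d * r + r * d)

def aslora_params(num_layers: int, d: int, r: int, num_b_groups: int) -> int:
    # ASLoRA-style sharing: one A (d x r) shared by every layer, while the
    # per-layer B matrices are adaptively merged so that only
    # `num_b_groups` distinct B (r x d) matrices remain after training.
    return d * r + num_b_groups * r * d

# Hypothetical configuration: 32 layers, hidden size 4096, rank 8,
# with the B matrices merged into 6 groups (all assumed values).
L, d, r = 32, 4096, 8
full = lora_params(L, d, r)
shared = aslora_params(L, d, r, num_b_groups=6)
print(full, shared, shared / full)  # the ratio stays well under 0.25
```

Under these assumed shapes the shared variant uses roughly 11% of LoRA's tunable parameters, consistent with the "less than 25%" regime described above; the exact ratio depends on how aggressively the B matrices are merged.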