The rapid advancement of large language models (LLMs) has revolutionized natural language processing, creating an increased need for efficient, task-specific fine-tuning methods. Traditional fine-tuning of LLMs involves updating a large number of parameters, which is computationally expensive and memory-intensive. Low-Rank Adaptation (LoRA) has emerged as a promising solution, enabling parameter-efficient fine-tuning by reducing the number of trainable parameters. However, although LoRA reduces the number of trainable parameters, its modules still pose significant storage challenges. We propose LoRA-Mini, an optimized adaptation of LoRA that improves parameter efficiency by splitting the low-rank matrices into four parts, with only the two inner matrices being trainable. This approach achieves up to a 20x reduction in trainable parameters compared to standard LoRA while preserving comparable performance, addressing both computational and storage efficiency in LLM fine-tuning.
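The four-way split described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' exact method: the shapes, initialization scheme, and scaling factor are assumptions. The standard LoRA update ΔW = BA is factored further into ΔW = B_out · B_in · A_in · A_out, where the two outer matrices are frozen and only the two small inner matrices are trained.

```python
import torch
import torch.nn as nn

class LoRAMiniLinear(nn.Module):
    """Sketch of a LoRA-Mini adapter around a frozen linear layer.

    The LoRA update dW = B @ A is factored into four matrices,
    dW = B_out @ B_in @ A_in @ A_out, where only the two inner
    matrices (B_in, A_in) are trainable. Initialization here is
    an illustrative assumption, not the paper's exact recipe.
    """
    def __init__(self, in_features, out_features, r=64, r_inner=8, alpha=1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # frozen pretrained weight
        # Frozen outer projections (randomly initialized in this sketch).
        self.A_out = nn.Parameter(torch.randn(r, in_features) / r**0.5,
                                  requires_grad=False)
        self.B_out = nn.Parameter(torch.randn(out_features, r) / r**0.5,
                                  requires_grad=False)
        # Trainable inner matrices; A_in is zero-initialized so the
        # adapter contributes nothing at the start of fine-tuning.
        self.A_in = nn.Parameter(torch.zeros(r_inner, r))
        self.B_in = nn.Parameter(torch.randn(r, r_inner) / r**0.5)
        self.scale = alpha

    def forward(self, x):
        # Compose the four factors into the low-rank update.
        delta = self.B_out @ self.B_in @ self.A_in @ self.A_out
        return self.base(x) + self.scale * (x @ delta.T)

def trainable_params(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

layer = LoRAMiniLinear(4096, 4096, r=64, r_inner=8)
standard_lora = 2 * 4096 * 64  # trainable params of plain LoRA at rank 64
print(trainable_params(layer), standard_lora)
```

With these example dimensions, only the inner matrices (2 · r · r_inner parameters) are trained, so the trainable-parameter count no longer scales with the model's hidden size, which is where the reduction over standard LoRA comes from.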