Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA have made it more efficient to adapt Large Language Models (LLMs) to new tasks. However, these methods often underperform full fine-tuning, particularly on complex datasets, and the gap widens further in complex domains, which highlights the need for PEFT approaches that achieve better performance. Through a series of experiments, we uncover two critical insights into the training and parameter inefficiency of LoRA. Building on these insights, we develop HydraLoRA, a LoRA framework with an asymmetric structure that eliminates the need for domain expertise. Our experiments demonstrate that HydraLoRA outperforms other PEFT approaches, even those that rely on domain knowledge during the training and inference phases. Code: \href{https://github.com/Clin0212/HydraLoRA}{https://github.com/Clin0212/HydraLoRA}.