Instruction tuning can stimulate or enhance specific capabilities of large language models (LLMs). However, the training data must be carefully balanced to prevent catastrophic forgetting and interference between tasks. To address these limitations and make training more flexible, we propose the Mixture-of-LoRAs (MoA) architecture, a novel, parameter-efficient tuning method designed for multi-task learning with LLMs. We first train multiple domain-specific LoRA modules individually, each on the supervised corpus for its domain. These LoRA modules can be viewed as experts in the sense of the design principles of Mixture-of-Experts (MoE). We then combine the LoRAs using an explicit routing strategy and introduce domain labels to facilitate multi-task learning, which helps prevent interference between tasks and ultimately improves performance on each individual task. Furthermore, each LoRA module can be iteratively adapted to a new domain, allowing quick domain-specific adaptation. Experiments on diverse tasks demonstrate superior and robust performance, which can further promote the wide application of domain-specific LLMs.
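The combination step described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the class names, the `route` method, the linear router, and all hyperparameters (`rank`, `alpha`, initialization scales) are assumptions made for illustration. The sketch only shows the general idea that a frozen base weight is shared, each domain owns its own low-rank update, and a domain label (or, when none is given, a router) selects exactly one LoRA per input.

```python
import numpy as np


class LoRAExpert:
    """One low-rank adapter: delta(x) = (alpha / rank) * x @ A @ B."""

    def __init__(self, d_in, d_out, rank, alpha, rng):
        self.A = rng.normal(0.0, 0.02, size=(d_in, rank))
        # Zero-initialized B makes a freshly created adapter a no-op.
        self.B = np.zeros((rank, d_out))
        self.scale = alpha / rank

    def delta(self, x):
        return self.scale * (x @ self.A @ self.B)


class MixtureOfLoRAsLayer:
    """Frozen base projection plus one LoRA expert per domain (illustrative)."""

    def __init__(self, d_in, d_out, domains, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.02, size=(d_in, d_out))  # frozen base weight
        self.domains = list(domains)
        self.experts = {
            name: LoRAExpert(d_in, d_out, rank, alpha, rng) for name in self.domains
        }
        # Hypothetical linear router, used only when no domain label is supplied.
        self.router_W = rng.normal(0.0, 0.02, size=(d_in, len(self.domains)))

    def route(self, x, domain=None):
        """Return the name of the single expert to apply to this input."""
        if domain is not None:
            if domain not in self.experts:
                raise KeyError(f"unknown domain: {domain}")
            return domain  # explicit routing via the domain label
        logits = x.mean(axis=0) @ self.router_W  # pool over the sequence axis
        return self.domains[int(np.argmax(logits))]

    def forward(self, x, domain=None):
        """x has shape (seq_len, d_in); returns (output, chosen_domain)."""
        chosen = self.route(x, domain)
        return x @ self.W + self.experts[chosen].delta(x), chosen
```

Because only the selected expert contributes to the output, changing the parameters of one domain's LoRA leaves the outputs for every other domain unchanged, which is the sense in which this kind of explicit routing avoids interference between tasks in the sketch.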