Existing research has shown that large language models (LLMs) exhibit remarkable performance in language understanding and generation. However, when LLMs are continually fine-tuned on complex and diverse domain-specific downstream tasks, their inference performance on historical tasks decreases dramatically, a problem known as catastrophic forgetting. A trade-off must therefore be struck between learning plasticity and memory stability. Many existing works have explored strategies such as memory replay, regularization, and parameter isolation, but little is known about the geometric connections among adjacent minima in continual LLM fine-tuning. In this work, we investigate the geometric connections between different minima through the lens of mode connectivity, the phenomenon in which different minima can be connected by a low-loss valley. Through extensive experiments, we uncover mode connectivity in the LLM continual learning scenario and find that it can strike a balance between plasticity and stability. Building upon these findings, we propose a simple yet effective method called Interpolation-based LoRA (I-LoRA), which constructs a dual-memory experience replay framework based on LoRA parameter interpolations. Extensive experiments and analysis on eight domain-specific CL benchmarks demonstrate that I-LoRA consistently and significantly improves over previous state-of-the-art approaches, with performance gains of up to $11\%$, providing a strong baseline and insights for future research on the large language model continual learning problem. Our code is available at \url{https://github.com/which47/LLMCL}.
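To make the notion of parameter interpolation concrete, the following is a minimal sketch of linear interpolation between two sets of LoRA parameters, which is the basic operation used to probe a path between two minima. It is an illustration under stated assumptions, not the released I-LoRA implementation: the names `Params`, `interpolate`, and `linear_path` are hypothetical, parameters are represented as plain lists of floats rather than tensors, and the abstract does not specify the exact interpolation rule used by the method.

```python
from typing import Dict, List

# Hypothetical representation: flattened LoRA weights keyed by parameter name.
Params = Dict[str, List[float]]


def interpolate(theta_a: Params, theta_b: Params, lam: float) -> Params:
    """Return lam * theta_a + (1 - lam) * theta_b, element-wise."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if theta_a.keys() != theta_b.keys():
        raise ValueError("parameter sets must share the same keys")
    result: Params = {}
    for name in theta_a:
        a, b = theta_a[name], theta_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        result[name] = [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]
    return result


def linear_path(theta_a: Params, theta_b: Params, steps: int) -> List[Params]:
    """Evenly spaced points on the straight line from theta_a to theta_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        interpolate(theta_a, theta_b, 1.0 - i / (steps - 1))
        for i in range(steps)
    ]


# Toy example: weights after an earlier task and after the current task.
theta_old: Params = {"lora_A": [1.0, 2.0], "lora_B": [0.0, 4.0]}
theta_new: Params = {"lora_A": [3.0, 2.0], "lora_B": [2.0, 0.0]}
midpoint = interpolate(theta_old, theta_new, 0.5)
path = linear_path(theta_old, theta_new, 3)
```

Evaluating the loss on old and new tasks at each point of `path` is one simple way to check whether a low-loss region lies between the two endpoints; a straight line is only the simplest candidate path, and mode connectivity in general allows curved paths.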