Large language models (LLMs) are not amenable to frequent re-training because their massive scale makes training costly. However, updates are necessary to endow LLMs with new skills and keep them up to date with rapidly evolving human knowledge. This paper surveys recent work on continual learning for LLMs. Due to the unique nature of LLMs, we catalog continual learning techniques in a novel multi-stage categorization scheme involving continual pretraining, instruction tuning, and alignment. We contrast continual learning for LLMs with simpler adaptation methods used in smaller models, as well as with other enhancement strategies such as retrieval-augmented generation and model editing. Moreover, informed by a discussion of benchmarks and evaluation, we identify several challenges and future work directions for this crucial task.