Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale. However, updates are necessary to endow LLMs with new skills and keep them up to date with rapidly evolving human knowledge. This paper surveys recent works on continual learning for LLMs. Due to the unique nature of LLMs, we catalog continual learning techniques in a novel multi-stage categorization scheme, involving continual pretraining, instruction tuning, and alignment. We contrast continual learning for LLMs with simpler adaptation methods used in smaller models, as well as with other enhancement strategies like retrieval-augmented generation and model editing. Moreover, informed by a discussion of benchmarks and evaluation, we identify several challenges and future work directions for this crucial task.