As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, which rely on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental learning, addresses this challenge by enabling LLMs to learn continuously and adaptively over their operational lifetime, integrating new knowledge while retaining previously learned information and thereby preventing catastrophic forgetting. This survey examines the complex landscape of lifelong learning, categorizing strategies into two primary groups: Internal Knowledge and External Knowledge. Internal Knowledge includes continual pretraining and continual finetuning, each of which enhances the adaptability of LLMs in various scenarios. External Knowledge encompasses retrieval-based and tool-based lifelong learning, leveraging external data sources and computational tools to extend the model's capabilities without modifying core parameters. The key contributions of our survey are: (1) introducing a novel taxonomy that categorizes the extensive literature on lifelong learning into 12 scenarios; (2) identifying techniques common to all lifelong learning scenarios and classifying the existing literature into technique groups within each scenario; (3) highlighting emerging techniques, such as model expansion and data selection, that were less explored in the pre-LLM era. Through a detailed examination of these groups and their respective categories, this survey aims to enhance the adaptability, reliability, and overall performance of LLMs in real-world applications.