Existing continual learning (CL) methods mainly rely on fine-tuning or adapting large language models (LLMs), and thus still suffer from catastrophic forgetting (CF). Little work has been done to exploit in-context learning (ICL) to leverage the extensive knowledge within LLMs for CL without updating any parameters. However, incrementally learning each new task in ICL requires adding training examples from every class of the task to the prompt, which hampers scalability as the prompt length grows. This issue not only leads to excessively long prompts that exceed the input token limit of the underlying LLM but also degrades the model's performance due to the overextended context. To address this, we introduce InCA, a novel approach that integrates an external continual learner (ECL) with ICL to enable scalable CL without CF. The ECL is built incrementally to pre-select a small subset of likely classes for each test instance. By restricting the ICL prompt to only these selected classes, InCA prevents prompt lengths from becoming excessively long, while maintaining high performance. Experimental results demonstrate that InCA significantly outperforms existing CL baselines, achieving substantial performance gains.
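The pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's exact method: it assumes the ECL is realized as incrementally updated per-class mean embeddings with nearest-mean class pre-selection, and all class/function names (`ExternalContinualLearner`, `build_icl_prompt`) are hypothetical.

```python
import math

class ExternalContinualLearner:
    """Hypothetical ECL sketch: keeps a running mean embedding per class.

    Because only per-class statistics are stored and updated incrementally,
    learning a new class never overwrites old ones (no catastrophic forgetting).
    """
    def __init__(self):
        self.means = {}   # class label -> mean embedding (list of floats)
        self.counts = {}  # class label -> number of examples seen

    def update(self, label, embedding):
        # Incremental mean update for the example's class.
        n = self.counts.get(label, 0)
        old = self.means.get(label, [0.0] * len(embedding))
        self.means[label] = [(o * n + e) / (n + 1) for o, e in zip(old, embedding)]
        self.counts[label] = n + 1

    def top_k_classes(self, embedding, k=3):
        # Pre-select the k classes whose means are closest to the test instance.
        def dist(label):
            return math.sqrt(sum((m - e) ** 2
                                 for m, e in zip(self.means[label], embedding)))
        return sorted(self.means, key=dist)[:k]

def build_icl_prompt(examples, selected_classes, query):
    # Restrict demonstrations to the pre-selected classes, keeping the
    # prompt short regardless of how many classes have been learned so far.
    demos = [f"Text: {x}\nLabel: {y}" for x, y in examples if y in selected_classes]
    return "\n\n".join(demos + [f"Text: {query}\nLabel:"])
```

At test time, one would embed the query, call `top_k_classes` to shortlist candidate labels, and send only those classes' demonstrations to the LLM via `build_icl_prompt`; the underlying LLM's parameters are never updated.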