Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question answering and fact-checking. However, this knowledge can become obsolete as the world changes. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dynamic nature of world knowledge in language models (LMs) under real-time constraints. We propose a new benchmark and evaluation metric designed to measure both the rate of new knowledge acquisition and the retention of previously learned knowledge. Our empirical evaluation, conducted using a variety of state-of-the-art methods, establishes robust baselines for OCKL. Our results reveal that existing continual learning approaches are insufficient for tackling the unique challenges posed by OCKL. We identify key factors that influence the trade-off between knowledge acquisition and retention, thereby advancing our understanding of how to train LMs in a continually evolving environment.