Incremental learning (IL) is essential for realizing human-level intelligence in neural networks. However, existing IL scenarios and datasets are ill-suited for assessing forgetting in pretrained language models (PLMs), giving the illusion that PLMs do not suffer from catastrophic forgetting. To this end, we propose a challenging IL scenario called instance-incremental learning (IIL) and a novel dataset called Concept-1K, which supports an order of magnitude more IL steps. Through experiments on Concept-1K, we reveal that billion-parameter PLMs still suffer from catastrophic forgetting, and that the degree of forgetting is affected by model scale, pretraining, and buffer size. Furthermore, existing IL methods and a popular finetuning technique, LoRA, fail to achieve satisfactory performance. Our study provides a novel scenario for future work to explore catastrophic forgetting in PLMs and encourages the design of more powerful techniques for alleviating it. The data, code, and scripts are publicly available at https://github.com/zzz47zzz/codebase-for-incremental-learning-with-llm.