Incremental learning (IL) is essential to realizing human-level intelligence in neural networks. However, existing IL scenarios and datasets are inadequate for assessing forgetting in pretrained language models (PLMs), giving the illusion that PLMs do not suffer from catastrophic forgetting. To this end, we propose a challenging IL scenario called instance-incremental learning (IIL) and a novel dataset called Concept-1K, which supports an order of magnitude more IL steps. Based on experiments on Concept-1K, we reveal that billion-parameter PLMs still suffer from catastrophic forgetting, and that the forgetting is affected by model scale, pretraining, and buffer size. Furthermore, existing IL methods and a popular finetuning technique, LoRA, fail to achieve satisfactory performance. Our study provides a novel scenario for future work to explore catastrophic forgetting in PLMs and encourages the design of more powerful techniques for alleviating it. The data, code, and scripts are publicly available at https://github.com/zzz47zzz/pretrained-lm-for-incremental-learning.