The incremental sequence labeling task involves continuously learning new classes over time while retaining knowledge of previous ones. Our investigation identifies two significant semantic shifts: E2O (the model mislabels an old entity as a non-entity) and O2E (the model labels a non-entity or old entity as a new entity). Previous research has predominantly focused on addressing the E2O problem while neglecting the O2E issue. This neglect biases the model toward classifying new data samples as belonging to the new classes during learning. To address these challenges, we propose a novel framework, Incremental Sequential Labeling without Semantic Shifts (IS3). Motivated by the identified semantic shifts (E2O and O2E), IS3 aims to mitigate catastrophic forgetting. For the E2O problem, we use knowledge distillation to maintain the model's discriminative ability for old entities. To tackle the O2E problem, we alleviate the model's bias toward new entities through debiasing at both the loss and optimization levels. Experimental evaluation on three datasets with various incremental settings demonstrates that IS3 outperforms the previous state-of-the-art method by a significant margin. The data, code, and scripts are publicly available at https://github.com/zzz47zzz/codebase-for-incremental-learning-with-llm.
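The knowledge-distillation component for the E2O problem can be illustrated with a minimal sketch: the previous-step model (teacher) provides a distribution over old classes for each token, and the current model (student) is penalized for drifting away from it. This is a generic token-level distillation loss, not the exact IS3 objective; the function names and the temperature hyperparameter `T` are illustrative assumptions.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_kd_loss(teacher_logits, student_logits, T=2.0):
    """Distillation loss for one token: KL(teacher || student) computed
    over the old-class logits, softened by temperature T (illustrative).
    A small loss means the student still discriminates old entities the
    way the teacher did, mitigating the E2O shift."""
    p = softmax([z / T for z in teacher_logits])  # teacher distribution
    q = softmax([z / T for z in student_logits])  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits incur (near-)zero loss; divergent logits are penalized.
aligned = token_kd_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
drifted = token_kd_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

In practice this per-token loss would be averaged over all tokens in a batch and added, with a weighting coefficient, to the (debiased) classification loss on the new classes.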