Named Entity Recognition (NER) models capable of Continual Learning (CL) are valuable in practice in areas where the set of entity types keeps growing (e.g., personal assistants). Meanwhile, the learning paradigm of NER has advanced to new patterns such as span-based methods. However, their potential for CL has not been fully explored. In this paper, we propose SpanKL, a simple yet effective Span-based model with Knowledge distillation (KD) to preserve memories and multi-Label prediction to prevent conflicts in CL-NER. Unlike prior sequence labeling approaches, the inherently independent modeling at the span and entity level, together with the coherent optimization designed for SpanKL, promotes its learning at each incremental step and mitigates forgetting. Experiments on synthetic CL datasets derived from OntoNotes and Few-NERD show that SpanKL significantly outperforms the previous SoTA in many aspects and obtains the smallest gap between CL and the upper bound, revealing its high practical value.
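The combination of multi-label span prediction and knowledge distillation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `span_cl_loss`, its parameters, the use of binary cross-entropy against the teacher's sigmoid outputs as the distillation term, and the `kd_weight` factor are all assumptions made for illustration. Each span gets one independent sigmoid score per entity type, so new types are trained on gold labels while old types are pulled toward the predictions of the previous-step model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, target, eps=1e-12):
    """Binary cross-entropy between a probability and a hard or soft target."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def span_cl_loss(student_logits, teacher_logits, gold_new, num_old, kd_weight=1.0):
    """Hypothetical per-step loss for span-based continual NER.

    student_logits: per span, logits for all types (old types first, then new).
    teacher_logits: per span, logits of the previous-step model for old types.
    gold_new:       per span, 0/1 labels for the new types only.
    """
    new_loss = 0.0
    kd_loss = 0.0
    for s_row, t_row, g_row in zip(student_logits, teacher_logits, gold_new):
        # Old types: no gold labels at this step, so distill from the teacher.
        for k in range(num_old):
            kd_loss += bce(sigmoid(s_row[k]), sigmoid(t_row[k]))
        # New types: independent binary decision per (span, type) pair.
        for j, gold in enumerate(g_row):
            new_loss += bce(sigmoid(s_row[num_old + j]), gold)
    return (new_loss + kd_weight * kd_loss) / len(student_logits)


# Two spans, two old types, one new type (toy numbers, not from the paper).
teacher = [[2.0, -3.0], [-2.0, -3.0]]
gold_new = [[1], [0]]
student_kept = [[2.0, -3.0, 4.0], [-2.0, -3.0, -4.0]]   # agrees with teacher
student_drift = [[-2.0, 3.0, 4.0], [2.0, 3.0, -4.0]]    # forgot old types

loss_kept = span_cl_loss(student_kept, teacher, gold_new, num_old=2)
loss_drift = span_cl_loss(student_drift, teacher, gold_new, num_old=2)
```

In this toy setup, `loss_drift` is larger than `loss_kept` because the distillation term penalizes a student whose old-type scores move away from the teacher's. Because each type has its own sigmoid, a span that is unlabeled for the new type is not forced to compete with old types in a single softmax, which is the kind of label conflict the multi-label formulation is meant to avoid.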