Named Entity Recognition (NER) models capable of Continual Learning (CL) are valuable in real-world settings where the set of entity types keeps growing (e.g., personal assistants). Meanwhile, the learning paradigm of NER has advanced to new patterns such as span-based methods, yet their potential for CL has not been fully explored. In this paper, we propose SpanKL, a simple yet effective Span-based model that uses Knowledge distillation (KD) to preserve memories and multi-Label prediction to prevent conflicts in CL-NER. Unlike prior sequence labeling approaches, the inherently independent modeling at the span and entity levels, together with the coherent optimization designed for SpanKL, promotes learning at each incremental step and mitigates forgetting. Experiments on synthetic CL datasets derived from OntoNotes and Few-NERD show that SpanKL significantly outperforms the previous SoTA in many aspects and obtains the smallest gap between CL and the upper bound, revealing its high practical value. The code is available at https://github.com/Qznan/SpanKL.
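To make the combination of span-level multi-label prediction and knowledge distillation more concrete, the following is a minimal sketch and not the released SpanKL implementation. It assumes details the abstract does not state: each span receives one independent sigmoid score per entity type, new types are trained with binary cross-entropy against gold labels, and old types are trained against soft targets produced by the previous-step model. The function names (`enumerate_spans`, `span_cl_loss`) and the data layout are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted probability and a (possibly soft) target."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def enumerate_spans(n_tokens: int, max_len: int):
    """List all (start, end) token spans, end inclusive, of at most max_len tokens."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i, min(i + max_len, n_tokens))
    ]


def span_cl_loss(new_logits, old_probs, gold, num_old_types):
    """Return (loss on new types, distillation loss on old types).

    new_logits:    dict span -> list of logits, old types first, then new types
                   (this ordering is an assumption of the sketch).
    old_probs:     dict span -> list of probabilities from the previous-step
                   model, one per old type; used as soft targets.
    gold:          dict span -> set of type indices annotated at the current step.
    num_old_types: number of entity types learned at earlier steps.
    """
    ce_loss = 0.0
    kd_loss = 0.0
    for span, logits in new_logits.items():
        for t, logit in enumerate(logits):
            p = sigmoid(logit)
            if t < num_old_types:
                # Old type: no gold labels at this step, so imitate the old model.
                kd_loss += bce(p, old_probs[span][t])
            else:
                # New type: independent binary decision per span and type.
                target = 1.0 if t in gold.get(span, set()) else 0.0
                ce_loss += bce(p, target)
    return ce_loss, kd_loss
```

Because every (span, type) pair is scored independently in this sketch, a span that is unlabeled for the current step's types can still keep a high score for an earlier type, which is one way to read the abstract's claim that multi-label prediction prevents conflicts between steps.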