Many previous named entity recognition (NER) models suffer from the Out-of-Entity (OOE) problem, i.e., the tokens in the entity mentions of test samples never appear in the training samples, which prevents these models from achieving satisfactory performance. To improve OOE-NER performance, in this paper we propose a new framework, namely S+NER, which fully leverages sentence-level information. S+NER achieves better OOE-NER performance mainly through the following two designs. 1) It first exploits a pre-trained language model's ability to understand the target entity's sentence-level context via a template set. 2) It then refines the sentence-level representation based on positive and negative templates, through a contrastive learning strategy and a template pooling method, to obtain better NER results. Extensive experiments on five benchmark datasets demonstrate that S+NER outperforms state-of-the-art OOE-NER models.
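The abstract's second design combines template pooling with a contrastive objective over positive and negative templates. The details are not given here, so the following is only a minimal sketch under assumed choices: mean pooling over template representations, and an InfoNCE-style loss that pulls the pooled positive-template vector toward the anchor sentence representation while pushing negative-template vectors away. All function names and the temperature value are illustrative, not the paper's actual implementation.

```python
import numpy as np

def template_pool(reps: np.ndarray) -> np.ndarray:
    """Mean-pool a set of template representations (T, d) into one vector (d,).
    Mean pooling is an assumed choice; the paper's pooling may differ."""
    return reps.mean(axis=0)

def info_nce_loss(anchor: np.ndarray,
                  pos_templates: np.ndarray,
                  neg_templates: np.ndarray,
                  tau: float = 0.1) -> float:
    """InfoNCE-style contrastive loss (a common formulation, assumed here):
    the pooled positive-template vector is the single positive at index 0,
    each negative template contributes one negative logit."""
    def norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = norm(anchor)                       # anchor sentence representation
    p = norm(template_pool(pos_templates)) # pooled positive templates
    n = norm(neg_templates)                # (N, d) negative templates

    pos_sim = np.dot(a, p) / tau           # scalar similarity to the positive
    neg_sim = (n @ a) / tau                # (N,) similarities to negatives

    logits = np.concatenate([[pos_sim], neg_sim])
    logits -= logits.max()                 # numerical stability
    # cross-entropy with the positive at index 0
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))
```

In this sketch, a lower loss indicates that the sentence representation is already closer to its positive templates than to its negatives, which is the refinement direction the abstract describes.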