Named entity recognition (NER) is a long-standing task in natural language processing. Nested entity recognition in particular has received extensive attention because nested entities are widespread. Recent work adapts the well-established set-prediction paradigm from object detection to handle entity nesting. However, these approaches are limited by manually created query vectors, which fail to adapt to the rich semantic information in the context. To address this issue, this paper presents an end-to-end entity detection approach built from a proposer and a regressor. First, the proposer uses a feature pyramid network to generate high-quality entity proposals. Then, the regressor refines the proposals to produce the final predictions. The model adopts an encoder-only architecture and thus benefits from semantically rich queries, precise entity localization, and ease of training. Moreover, we introduce a novel spatially modulated attention mechanism and progressive refinement for further improvement. Extensive experiments demonstrate that our model achieves strong performance on both flat and nested NER, reaching new state-of-the-art F1 scores of 80.74 on the GENIA dataset and 72.38 on the WeiboNER dataset.
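The propose-then-refine flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how pyramid levels map to spans or how the regressor computes its updates, so the stride-`2**k` layout, the integer boundary offsets, and the names `propose_spans`, `refine`, and `progressive_refine` are all assumptions made for illustration. In a real model the offsets would come from a learned network over encoder features.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def propose_spans(num_tokens: int, num_levels: int) -> List[Span]:
    """Enumerate candidate spans, one group per pyramid level.

    Assumption (not from the paper): level k has stride 2**k and each
    cell covers 2**k consecutive tokens.
    """
    proposals: List[Span] = []
    for level in range(num_levels):
        width = 2 ** level
        for start in range(0, num_tokens, width):
            end = min(start + width, num_tokens) - 1
            proposals.append((start, end))
    return proposals


def refine(span: Span, offsets: Tuple[int, int], num_tokens: int) -> Span:
    """Shift both boundaries by the given offsets, clamped to the sentence."""
    start = min(max(span[0] + offsets[0], 0), num_tokens - 1)
    end = min(max(span[1] + offsets[1], start), num_tokens - 1)
    return (start, end)


def progressive_refine(
    span: Span,
    predict_offsets: Callable[[Span], Tuple[int, int]],
    num_tokens: int,
    steps: int = 3,
) -> Span:
    """Apply the (hypothetical) offset predictor several times in a row."""
    for _ in range(steps):
        span = refine(span, predict_offsets(span), num_tokens)
    return span


def toy_predictor(gold: Span) -> Callable[[Span], Tuple[int, int]]:
    """Stand-in for a learned regressor: move each boundary one token
    toward a known gold span. Purely illustrative."""
    def step(current: int, target: int) -> int:
        return (target > current) - (target < current)

    def predict(span: Span) -> Tuple[int, int]:
        return (step(span[0], gold[0]), step(span[1], gold[1]))

    return predict


proposals = propose_spans(num_tokens=8, num_levels=3)
refined = progressive_refine((0, 3), toy_predictor((1, 5)), num_tokens=8)
```

Here a coarse proposal covering tokens 0 to 3 is nudged over several rounds toward a target span covering tokens 1 to 5, which mirrors the idea of refining proposals step by step rather than predicting final boundaries in one shot.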