Named Entity Recognition (NER) aims to extract entity mentions from text and classify them into predefined types (e.g., organization or person name). Many recent works formulate NER as a machine reading comprehension (MRC) problem, termed MRC-based NER, in which entities are recognized by answering questions about the predefined entity types given the context. However, these works ignore the label dependencies among entity types, which are critical for recognizing named entities precisely. In this paper, we propose Multi-NER, which incorporates the label dependencies among entity types into a multi-task learning framework for better MRC-based NER. We decompose MRC-based NER into multiple tasks and use a self-attention module to capture label dependencies. We conduct comprehensive experiments on both nested and flat NER datasets to validate the effectiveness of Multi-NER. Experimental results show that Multi-NER achieves better performance on all of these datasets.
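The two ideas named above, one MRC question per entity type and self-attention across entity types, can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the architecture, so the entity types, question wording, function names, and the use of identity projections (no learned query/key/value weights) are all assumptions made for illustration.

```python
import math

# Hypothetical entity types and question templates (not taken from the paper).
QUERIES = {
    "PER": "Which spans are person names?",
    "ORG": "Which spans are organizations?",
    "LOC": "Which spans are locations?",
}


def build_mrc_inputs(context, queries):
    """Pair the context with one question per entity type (one task per type)."""
    return {label: (question, context) for label, question in queries.items()}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    `vectors` holds one representation per entity type. Each output row is a
    weighted mix of all rows, so every type's representation can draw on the
    others. A trained model would use learned projections instead.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in vectors]
        row = softmax(scores)
        mixed = [
            sum(row[j] * vectors[j][d] for j in range(len(vectors)))
            for d in range(dim)
        ]
        outputs.append(mixed)
        weights.append(row)
    return outputs, weights


if __name__ == "__main__":
    inputs = build_mrc_inputs("Alice works at Acme in Paris.", QUERIES)
    # Placeholder per-type vectors; a real system would take these from an encoder.
    type_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    mixed, attn = self_attention(type_vectors)
    for label, row in zip(inputs, attn):
        print(label, [round(w, 3) for w in row])
```

In this sketch each attention row shows how strongly one entity type attends to the others, which is one plausible way to read "capturing label dependencies" among types.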