Recent span-based joint extraction models have shown clear advantages in both entity recognition and relation extraction. These models treat text spans as candidate entities and span pairs as candidate relation tuples, achieving state-of-the-art results on datasets such as ADE. However, they encounter a large number of non-entity spans and irrelevant span pairs during these tasks, which significantly impairs performance. To address this issue, this paper introduces a span-based multitask entity-relation joint extraction model. The approach employs multitask learning to alleviate the impact of negative samples on the entity and relation classifiers. Additionally, we use the Intersection over Union (IoU) concept to introduce positional information into the entity classifier, enabling span boundary detection. Furthermore, by incorporating the entity logits predicted by the entity classifier into the embedded representation of entity pairs, we enrich the semantic input to the relation classifier. Experimental results demonstrate that the proposed SpERT.MT model effectively mitigates the adverse effects of excessive negative samples on model performance. The model achieves F1 scores of 73.61%, 53.72%, and 83.72% on three widely used public datasets, CoNLL04, SciERC, and ADE, respectively.
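To make the IoU idea concrete for text spans, the following is a minimal sketch of how overlap between a candidate span and a gold entity span could be measured on the token axis. It is an illustration only, not the SpERT.MT implementation: the function names `span_iou` and `best_iou`, the half-open `(start, end)` span convention, and the choice of taking the maximum over gold spans are all assumptions introduced here.

```python
from typing import List, Tuple

# Assumed convention: a span is (start, end) in token indices, end exclusive.
Span = Tuple[int, int]


def span_iou(a: Span, b: Span) -> float:
    """Intersection over Union of two token spans on the 1-D token axis."""
    # Overlap length is zero when the spans do not intersect.
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def best_iou(candidate: Span, gold_spans: List[Span]) -> float:
    """Highest IoU between a candidate span and any gold entity span.

    Hypothetical helper: a score like this could serve as a soft
    positional target for negative (non-entity) spans.
    """
    return max((span_iou(candidate, g) for g in gold_spans), default=0.0)
```

Under this sketch, a candidate that exactly matches a gold entity scores 1.0, a partially overlapping candidate scores between 0 and 1, and a disjoint candidate scores 0.0, so negative spans are graded by how close their boundaries are to a true entity.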