Relation tuple extraction from text is an important task for building knowledge bases. Recently, joint entity and relation extraction models have achieved very high F1 scores on this task. However, the experimental settings used to evaluate these models are restrictive and the datasets are not realistic: they do not include sentences with zero tuples (zero-cardinality sentences). In this paper, we evaluate state-of-the-art joint entity and relation extraction models in a more realistic setting by including sentences that do not contain any tuples. Our experiments show a significant drop in F1 score in this setting ($\sim 10-15\%$ on one dataset and $\sim 6-14\%$ on another). We also propose a two-step modeling approach that uses a simple BERT-based classifier and improves the overall performance of these models in this realistic experimental setup.
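As an illustration of the two-step idea, the following is a minimal sketch, not the paper's implementation. It assumes the first step is a binary classifier that decides whether a sentence contains any tuple, and that the extractor runs only on accepted sentences. The names `two_step_extract`, `micro_f1`, `toy_has_tuple`, and `toy_extract` are hypothetical; the toy functions are rule-based stand-ins for the BERT-based classifier and the joint extraction model, and the two sentences are made-up examples.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject entity, relation, object entity)


def two_step_extract(
    sentences: Iterable[str],
    has_tuple: Callable[[str], bool],
    extract: Callable[[str], Set[Triple]],
) -> List[Set[Triple]]:
    """Step 1: classify each sentence. Step 2: extract only if it was accepted."""
    return [extract(s) if has_tuple(s) else set() for s in sentences]


def micro_f1(pred: List[Set[Triple]], gold: List[Set[Triple]]) -> float:
    """Micro-averaged F1 over tuples; a zero-tuple sentence has an empty gold set."""
    tp = sum(len(p & g) for p, g in zip(pred, gold))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data (assumption): one sentence with a tuple, one zero-cardinality sentence.
sentences = ["Paris is the capital of France.", "It rained all day."]
gold = [{("Paris", "capital_of", "France")}, set()]


def toy_has_tuple(sentence: str) -> bool:
    # Hypothetical stand-in for the BERT-based sentence classifier.
    return "capital" in sentence


def toy_extract(sentence: str) -> Set[Triple]:
    # Hypothetical stand-in for a joint extractor that always emits a tuple.
    if "capital" in sentence:
        return {("Paris", "capital_of", "France")}
    return {("It", "located_in", "day")}  # spurious tuple on a zero-tuple sentence


one_step = [toy_extract(s) for s in sentences]
two_step = two_step_extract(sentences, toy_has_tuple, toy_extract)

print(f"one-step F1: {micro_f1(one_step, gold):.3f}")
print(f"two-step F1: {micro_f1(two_step, gold):.3f}")
```

In this toy case the spurious tuple on the zero-cardinality sentence lowers precision for the one-step run, while the filter removes it. A real classifier can also reject sentences that do contain tuples, which would cost recall, so the net effect depends on the classifier's accuracy.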