Relation extraction is a central task in natural language processing (NLP) and information retrieval (IR) research. We argue that an important type of relation not yet explored in NLP or IR research is that of an event being an argument, required or optional, of another event. We introduce the human-annotated Event Dependency Relation dataset (EDeR), which provides this dependency relation. The annotation is done on a sample of documents from the OntoNotes dataset, which has the added benefit that EDeR integrates with the existing, orthogonal annotations of that dataset. We investigate baseline approaches for predicting the event dependency relation, the best of which achieves an accuracy of 82.61% for binary argument/non-argument classification. We show that recognizing this relation leads to more accurate event extraction (semantic role labelling) and can improve downstream tasks that depend on it, such as co-reference resolution. Furthermore, we demonstrate that the three-way classification into required argument, optional argument, or non-argument is a more challenging task.