Narratives are a rich source of events that unfold over time and context. Automatically understanding these events can provide a summarised representation of a narrative for further computation, such as reasoning. In this paper, we study the Information Status (IS) of events and propose a novel, challenging task: the automatic identification of new events in a narrative. We define an event as a triplet of subject, predicate, and object. An event is categorized as new with respect to the discourse context and according to whether it can be inferred through commonsense reasoning. Using human annotators, we annotated a publicly available corpus of narratives with new events at the sentence level. We present the annotation protocol and a study that validates the quality of the annotation and assesses the difficulty of the task. We publish the annotated dataset, the annotation materials, and machine learning baseline models for the task of new event extraction for narrative understanding.
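To make the event definition concrete, the following is a minimal sketch of the subject-predicate-object triplet as a data structure, together with a deliberately naive notion of novelty. It is an illustration only, not the annotation protocol or any of the published baselines: the names `Event` and `naive_new_events` are hypothetical, and the sketch treats an event as new simply when the same triplet has not appeared earlier in the narrative. It does not model commonsense inference, which is part of the definition above.

```python
from typing import List, NamedTuple, Set


class Event(NamedTuple):
    """An event as a (subject, predicate, object) triplet (hypothetical name)."""
    subject: str
    predicate: str
    object: str


def naive_new_events(sentences: List[List[Event]]) -> List[List[Event]]:
    """For each sentence, keep the events not seen earlier in the narrative.

    Assumption: novelty is approximated by exact match on the lowercased
    triplet. Commonsense inferability is NOT modeled here.
    """
    seen: Set[Event] = set()
    result: List[List[Event]] = []
    for events in sentences:
        fresh: List[Event] = []
        for ev in events:
            # Normalize case and whitespace before comparing triplets.
            key = Event(*(part.lower().strip() for part in ev))
            if key not in seen:
                seen.add(key)
                fresh.append(ev)
        result.append(fresh)
    return result


# Usage: the second sentence repeats one event and introduces another.
narrative = [
    [Event("I", "visited", "my mother")],
    [Event("I", "visited", "my mother"), Event("she", "cooked", "dinner")],
]
new_per_sentence = naive_new_events(narrative)
```

In this sketch, `new_per_sentence` keeps the visit for the first sentence and only the cooking event for the second. Exact matching is the simplest possible stand-in for discourse context, which is one way to see why the task is described as challenging: paraphrases and inferable events would all be wrongly flagged as new.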