Narratives are a rich source of events that unfold over time and context. Automatically understanding these events yields a summarised representation of the narrative that supports further computation, such as reasoning. In this paper, we study the Information Status (IS) of events and propose a novel and challenging task: the automatic identification of new events in a narrative. We define an event as a triplet of subject, predicate, and object. An event is categorized as new with respect to the discourse context and to whether it can be inferred through commonsense reasoning. Using human annotators, we annotated a publicly available corpus of narratives with new events at the sentence level. We present the annotation protocol and study the quality of the annotation and the difficulty of the task. We publish the annotated dataset, the annotation materials, and machine learning baseline models for the task of new event extraction for narrative understanding.