Most existing work on event extraction has focused on sentence-level texts and presumes the identification of a trigger-span -- a word or phrase in the input that evokes the occurrence of an event of interest. Event arguments are then extracted with respect to the trigger. Indeed, triggers are treated as integral to, and trigger detection as an essential component of, event extraction. In this paper, we provide the first investigation of the role of triggers for the more difficult and much less studied task of document-level event extraction. We analyze their usefulness in multiple end-to-end and pipelined neural event extraction models for three document-level event extraction datasets, measuring performance using triggers of varying quality (human-annotated, LLM-generated, keyword-based, and random). Our research shows that trigger effectiveness varies based on the extraction task's characteristics and data quality, with basic, automatically-generated triggers serving as a viable alternative to human-annotated ones. Furthermore, providing detailed event descriptions to the extraction model helps maintain robust performance even when trigger quality degrades. Perhaps surprisingly, we also find that the mere existence of trigger input, even random ones, is important for prompt-based LLM approaches to the task.