Cascading Large Language Models for Salient Event Graph Generation

Generating event graphs from long documents is challenging because it involves several interdependent tasks, such as detecting events, identifying their relationships, and reconciling unstructured input with structured graphs. Recent studies typically treat all events as equally important, failing to distinguish the salient events that are crucial for understanding narratives. This paper presents CALLMSAE, a CAscading Large Language Model framework for SAlient Event graph generation, which leverages the capabilities of LLMs and eliminates the need for costly human annotations. We first prompt LLMs to generate summaries, from which salient events are identified. Next, we develop an iterative code refinement prompting strategy to generate event relation graphs, removing hallucinated relations and recovering missing edges. Contextualised graph generation models fine-tuned on the LLM-generated graphs outperform models trained on CAEVO-generated data. Experimental results on a human-annotated test set show that the proposed method generates salient and more accurate graphs, outperforming competitive baselines.
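
The refinement step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the prompts, the code representation, or the relation types, so `EventGraph`, `refine_graph`, `propose`, `verify`, and the relation labels `causes` and `before` are all hypothetical names chosen for illustration. The two callables stand in for LLM calls, one proposing candidate edges (recovering missing ones) and one checking a single edge (filtering hallucinated ones). The loop stops when a round no longer changes the edge set.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

# (head event, relation, tail event); relation labels are illustrative only.
Edge = tuple[str, str, str]


@dataclass
class EventGraph:
    events: set[str] = field(default_factory=set)
    edges: set[Edge] = field(default_factory=set)

    def add_edge(self, head: str, relation: str, tail: str) -> None:
        # Ignore edges that mention events outside the salient event set
        # or that link an event to itself.
        if head in self.events and tail in self.events and head != tail:
            self.edges.add((head, relation, tail))


def refine_graph(
    events: Iterable[str],
    propose: Callable[[EventGraph], Iterable[Edge]],  # stand-in for an LLM call
    verify: Callable[[Edge], bool],                   # stand-in for an LLM call
    max_rounds: int = 3,
) -> EventGraph:
    graph = EventGraph(events=set(events))
    for _ in range(max_rounds):
        before = set(graph.edges)
        # Recover missing edges: add whatever the proposer suggests.
        for head, relation, tail in propose(graph):
            graph.add_edge(head, relation, tail)
        # Remove hallucinated edges: keep only those the verifier accepts.
        graph.edges = {edge for edge in graph.edges if verify(edge)}
        if graph.edges == before:
            break  # a full round changed nothing, so stop early
    return graph


# Toy stand-ins for the two LLM calls (assumptions, not real model output).
salient_events = ["earthquake strikes", "buildings collapse", "rescue begins"]


def propose(graph: EventGraph) -> list[Edge]:
    return [
        ("earthquake strikes", "causes", "buildings collapse"),
        ("buildings collapse", "before", "rescue begins"),
        ("rescue begins", "causes", "earthquake strikes"),  # implausible edge
        ("aliens land", "before", "rescue begins"),         # unknown event
    ]


def verify(edge: Edge) -> bool:
    return edge != ("rescue begins", "causes", "earthquake strikes")


graph = refine_graph(salient_events, propose, verify)
for edge in sorted(graph.edges):
    print(edge)
```

In this toy run the edge naming an unknown event is rejected when it is added, the implausible causal edge is dropped by the verifier, and the loop ends after the second round because the edge set is unchanged.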


