Salient facts of sociopolitical events are distilled into quadruples in the format of subject, relation, object, and timestamp. Machine learning methods such as graph neural networks (GNNs) and recurrent neural networks (RNNs) have been developed to make predictions and infer relations on quadruple-based knowledge graphs (KGs). In many applications, quadruples are extended to quintuples with auxiliary attributes, such as text summaries describing the quadruple events. In this paper, we comprehensively investigate how large language models (LLMs) streamline the design of event prediction frameworks that use quadruple-based or quintuple-based data while maintaining competitive accuracy. We propose LEAP, a unified framework that leverages large language models as event predictors. Specifically, we develop multiple prompt templates to frame the object prediction (OP) task as a standard question-answering (QA) task suitable for instruction fine-tuning with an encoder-decoder LLM. For the multi-event forecasting (MEF) task, we design a simple yet effective prompt template for each event quintuple. This approach removes the need for GNNs and RNNs, instead using an encoder-only LLM to generate fixed intermediate embeddings, which are processed by a customized downstream head with a self-attention mechanism to predict potential future relation occurrences. Extensive experiments on multiple real-world datasets with various evaluation metrics validate the effectiveness of our approach.
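To make the MEF pipeline concrete, the following is a minimal NumPy sketch of the downstream head described above: fixed per-event embeddings (stand-ins for frozen encoder-only LLM outputs) are mixed by a single self-attention layer, pooled, and mapped through a sigmoid to per-relation occurrence probabilities. This is an illustrative sketch only, not the paper's implementation; the dimensions, parameter names, and single-head/mean-pool choices are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # single-head scaled dot-product self-attention over the event set
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores, axis=-1) @ V

def predict_relations(event_embs, params):
    """event_embs: (num_events, dim) fixed LLM embeddings, one per quintuple prompt.
    Returns per-relation occurrence probabilities, shape (num_relations,)."""
    H = self_attention(event_embs, params["Wq"], params["Wk"], params["Wv"])
    pooled = H.mean(axis=0)                    # aggregate the event set
    logits = pooled @ params["Wo"] + params["b"]
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid: multi-label prediction

# toy usage with random weights (dim=8, 4 events, 5 relation types)
rng = np.random.default_rng(0)
dim, num_events, num_relations = 8, 4, 5
params = {
    "Wq": rng.normal(size=(dim, dim)), "Wk": rng.normal(size=(dim, dim)),
    "Wv": rng.normal(size=(dim, dim)), "Wo": rng.normal(size=(dim, num_relations)),
    "b": np.zeros(num_relations),
}
probs = predict_relations(rng.normal(size=(num_events, dim)), params)
```

Treating MEF as multi-label classification over relation types (one sigmoid per relation, rather than a softmax) reflects that several relations can occur in the same future window.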