Fact checking aims to predict the veracity of a claim by reasoning over multiple pieces of evidence. It usually involves two stages: evidence retrieval and veracity reasoning. In this paper, we focus on the latter, reasoning over both unstructured text and structured table information. Previous works have primarily relied on fine-tuning pretrained language models or training homogeneous-graph-based models. Despite their effectiveness, we argue that they fail to exploit the rich semantic information underlying evidence with different structures. To address this, we propose HeterFC, a novel word-level Heterogeneous-graph-based model for Fact Checking over unstructured and structured information. Our approach builds a heterogeneous evidence graph, with words as nodes and carefully designed edges that represent different evidence properties. We perform information propagation via a relational graph neural network, which facilitates interactions between claims and evidence. An attention-based method then integrates the propagated information and is combined with a language model to generate predictions. We also introduce a multitask loss function to account for potential inaccuracies in evidence retrieval. Comprehensive experiments on the large fact checking dataset FEVEROUS demonstrate the effectiveness of HeterFC. Code will be released at: https://github.com/Deno-V/HeterFC.
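To make the propagation step concrete, the following is a minimal sketch of one relational message-passing layer over a word-level heterogeneous graph, written in the general style of a relational GCN. It is not the HeterFC implementation: the node names, the two relation types (`claim_evidence`, `within_evidence`), the two-dimensional features, and the fixed weight matrices are all hypothetical choices made for illustration, and the abstract does not specify which edge types or normalization the model actually uses.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One relational propagation step (illustrative, not the paper's exact layer).

    features:    {node: [float, ...]} word-node embeddings
    edges:       list of (src, relation, dst); messages flow src -> dst
    rel_weights: {relation: matrix}, one transform per edge type
    self_weight: matrix applied to a node's own embedding
    """
    # Group incoming neighbours by (destination node, relation type).
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        incoming[(dst, rel)].append(src)

    updated = {}
    for node, h in features.items():
        acc = matvec(self_weight, h)  # self-loop term
        for rel, W in rel_weights.items():
            nbrs = incoming.get((node, rel), [])
            # Mean-aggregate messages separately for each relation type.
            for src in nbrs:
                msg = matvec(W, features[src])
                acc = [a + m / len(nbrs) for a, m in zip(acc, msg)]
        updated[node] = [max(0.0, a) for a in acc]  # ReLU
    return updated


# Hypothetical toy graph: one claim word, one sentence word, one table-cell word.
features = {
    "claim:paris": [1.0, 0.0],
    "sent:paris": [0.0, 1.0],
    "table:france": [1.0, 1.0],
}
edges = [
    ("sent:paris", "claim_evidence", "claim:paris"),
    ("table:france", "claim_evidence", "claim:paris"),
    ("table:france", "within_evidence", "sent:paris"),
]
identity = [[1.0, 0.0], [0.0, 1.0]]
rel_weights = {
    "claim_evidence": identity,
    "within_evidence": [[0.5, 0.0], [0.0, 0.5]],
}

updated = rgcn_layer(features, edges, rel_weights, self_weight=identity)
```

In this sketch each edge type has its own transform, so evidence reaching the claim word from text and from table cells could be weighted differently once distinct relations are assigned to them. In a trained model the matrices would be learned parameters rather than fixed constants.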