Despite progress in automated fact-checking, most systems require large amounts of labeled training data, which is expensive to obtain. In this paper, we propose a novel zero-shot method that, instead of operating directly on the claim and evidence sentences, decomposes them into semantic triples augmented using external knowledge graphs and applies large language models trained for natural language inference. This allows it to generalize to adversarial datasets and to domains for which supervised models require specific training data. Our empirical results show that our approach outperforms previous zero-shot approaches on FEVER, FEVER-Symmetric, FEVER 2.0, and Climate-FEVER, while being comparable to or better than supervised models on the adversarial and out-of-domain datasets.
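The pipeline outlined in the abstract (decompose into triples, augment with a knowledge graph, run natural language inference) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the triples are supplied by hand instead of being extracted from sentences, `TOY_KG` is a hypothetical one-entry knowledge graph, `toy_nli` is a rule-based stand-in for a language model trained for natural language inference, and the aggregation rule in `verify` is one plausible choice. Only the three output labels follow the FEVER convention.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical toy knowledge graph: entity -> known triples about it.
TOY_KG: Dict[str, List[Triple]] = {
    "Paris": [("Paris", "capital of", "France")],
}


def augment(triples: List[Triple], kg: Dict[str, List[Triple]]) -> List[Triple]:
    """Return the triples plus any KG triples about their subjects or objects."""
    out = list(triples)
    for subj, _, obj in triples:
        for entity in (subj, obj):
            for extra in kg.get(entity, []):
                if extra not in out:
                    out.append(extra)
    return out


def toy_nli(premise: Triple, hypothesis: Triple) -> str:
    """Rule-based stand-in for an NLI model (assumption, not a real model).

    Identical triples entail; the same subject and relation with a different
    object counts as a contradiction; anything else is neutral.
    """
    if premise == hypothesis:
        return "entailment"
    if premise[:2] == hypothesis[:2]:
        return "contradiction"
    return "neutral"


def verify(
    claim: List[Triple],
    evidence: List[Triple],
    kg: Dict[str, List[Triple]] = TOY_KG,
    nli: Callable[[Triple, Triple], str] = toy_nli,
) -> str:
    """Label a claim against evidence using triple-level NLI verdicts."""
    premises = augment(evidence, kg)
    verdicts = []
    for hypothesis in claim:
        labels = {nli(premise, hypothesis) for premise in premises}
        if "entailment" in labels:
            verdicts.append("entailment")
        elif "contradiction" in labels:
            verdicts.append("contradiction")
        else:
            verdicts.append("neutral")
    # Assumed aggregation: any contradicted triple refutes the claim;
    # the claim is supported only if every triple is entailed.
    if "contradiction" in verdicts:
        return "REFUTES"
    if verdicts and all(v == "entailment" for v in verdicts):
        return "SUPPORTS"
    return "NOT ENOUGH INFO"


if __name__ == "__main__":
    evidence = [("Paris", "located in", "Europe")]
    print(verify([("Paris", "capital of", "France")], evidence))
    print(verify([("Paris", "capital of", "Germany")], evidence))
    print(verify([("Berlin", "capital of", "Germany")], evidence))
```

In this toy setup the evidence triple alone says nothing about capitals; the first two claims can only be decided because the knowledge-graph lookup adds a triple about Paris, which is the role the abstract assigns to external knowledge graphs. Swapping `toy_nli` for a real NLI model would require serializing each triple into text first, a step this sketch leaves out.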