Evidence retrieval is a core part of automatic fact-checking. Prior work makes simplifying assumptions about retrieval that depart from real-world use cases: it assumes no access to evidence, access to evidence curated by a human fact-checker, or access to evidence that became available long after the claim was made. In this work, we present the first fully automated pipeline to check real-world claims by retrieving raw evidence from the web. We restrict our retriever to documents that were available before the claim was made, modeling the realistic scenario in which an emerging claim needs to be checked. Our pipeline has five components: claim decomposition, raw document retrieval, fine-grained evidence retrieval, claim-focused summarization, and veracity judgment. We conduct experiments on complex political claims in the ClaimDecomp dataset and show that the aggregated evidence produced by our pipeline improves veracity judgments. Human evaluation finds that the evidence summary produced by our system is reliable (it does not hallucinate information) and relevant to answering key questions about a claim, suggesting that it can assist fact-checkers even when it cannot surface a complete evidence set.
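As a rough illustration of how the five components and the temporal restriction fit together, the following minimal sketch wires the stages in order and drops any document dated on or after the claim. It is not the system's actual implementation: the `Document` type, the `filter_by_claim_date` and `check_claim` functions, the in-memory `corpus`, and the stage callables are all hypothetical names assumed here for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass(frozen=True)
class Document:
    """Hypothetical container for a retrieved web document."""
    url: str
    published: date
    text: str


def filter_by_claim_date(docs: List[Document], claim_date: date) -> List[Document]:
    """Keep only documents published strictly before the claim was made."""
    return [d for d in docs if d.published < claim_date]


def check_claim(
    claim: str,
    claim_date: date,
    corpus: List[Document],
    decompose: Callable,   # claim -> list of questions
    retrieve: Callable,    # (questions, documents) -> documents
    select: Callable,      # (questions, documents) -> passages
    summarize: Callable,   # (claim, passages) -> summary
    judge: Callable,       # (claim, summary) -> verdict
):
    """Run the five stages in order; each stage is an assumed, pluggable callable."""
    questions = decompose(claim)                         # 1. claim decomposition
    allowed = filter_by_claim_date(corpus, claim_date)   # temporal restriction
    documents = retrieve(questions, allowed)             # 2. raw document retrieval
    passages = select(questions, documents)              # 3. fine-grained evidence retrieval
    summary = summarize(claim, passages)                 # 4. claim-focused summarization
    return judge(claim, summary)                         # 5. veracity judgment
```

Passing the stages in as callables keeps the sketch independent of any particular retriever or language model. The date filter is applied before retrieval so that no later stage can see evidence published after the claim.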