This article presents a pipeline for automated fact-checking that leverages publicly available language models and data. The objective is to assess the accuracy of textual claims using evidence from a ground-truth evidence corpus. The pipeline consists of two main modules: evidence retrieval and claim veracity evaluation. Our primary focus is on ease of deployment in languages that remain unexplored in the field of automated fact-checking. Unlike most similar pipelines, which work with evidence sentences, our pipeline processes data at the paragraph level, simplifying the overall architecture and data requirements. Given the high cost of annotating language-specific fact-checking training data, our solution builds on the Question Answering for Claim Generation (QACG) method, which we adapt and use to generate the data for all models of the pipeline. Our strategy enables the introduction of new languages through machine translation of only two fixed datasets of moderate size. Subsequently, any number of training samples can be generated from an evidence corpus in the target language. We provide open access to all data and fine-tuned models for the Czech, English, Polish, and Slovak pipelines, as well as to our codebase, which may be used to reproduce the results. We comprehensively evaluate the pipelines for all four languages, including human annotations and per-sample difficulty assessment using Pointwise V-information. The presented experiments are based on full Wikipedia snapshots to promote reproducibility. To facilitate implementation and user interaction, we develop the FactSearch application, which features the proposed pipeline, and report preliminary feedback on its performance.
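As a rough illustration of the per-sample difficulty measure mentioned in the abstract, Pointwise V-information (PVI) compares the probability a model assigns to the gold label when it sees the input with the probability assigned by a model trained on empty inputs: PVI(x → y) = −log2 g_null(y) + log2 g(y | x). The sketch below assumes both probabilities have already been computed; the function name, argument names, and sample values are hypothetical and are not taken from the authors' codebase.

```python
import math


def pointwise_v_information(p_gold_with_input: float, p_gold_null_input: float) -> float:
    """Return PVI in bits for one sample.

    p_gold_with_input: probability of the gold label from the model that sees
        the real input (hypothetical name, e.g. claim plus evidence paragraph).
    p_gold_null_input: probability of the gold label from a model trained on
        empty inputs (hypothetical name).
    """
    if not (0.0 < p_gold_with_input <= 1.0 and 0.0 < p_gold_null_input <= 1.0):
        raise ValueError("probabilities must lie in (0, 1]")
    # PVI(x -> y) = -log2 g_null(y) + log2 g(y | x)
    return math.log2(p_gold_with_input) - math.log2(p_gold_null_input)


# Illustrative values only: (probability with input, probability with null input).
samples = [(0.5, 0.25), (0.25, 0.25), (0.125, 0.25)]
pvi_scores = [pointwise_v_information(p, q) for p, q in samples]
```

Under this definition, a higher PVI marks a sample that is easier for the model given its input, while a negative PVI means the model does worse with the input than without it, which can indicate a hard or mislabeled sample.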