Automated fact-checking (AFC) is attracting growing attention from researchers aiming to help fact-checkers combat the increasing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help assess the veracity of claims, they often overlook the importance of verifying the source and quality of the collected "evidence". One overlooked challenge is the reliance on "leaked evidence": information gathered directly from fact-checking websites and used to train AFC systems, which results in an unrealistic setting for early misinformation detection. Similarly, including information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable, or Fact-checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net achieves up to 91.5% and 94.4% accuracy when leveraging domain credibility scores together with short or long texts, respectively. Finally, we assess the evidence provided by widely used fact-checking datasets, including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.
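To illustrate the filtering step described in the abstract, the following is a minimal sketch of how a three-way evidence classifier could be placed in front of an AFC pipeline so that only credible evidence is passed on. It is not EVVER-Net or its interface: the `Evidence` type, `filter_evidence`, `toy_classifier`, the lowercase label strings, and the example URLs are all hypothetical, and the URL-keyword rule merely stands in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Label names mirror the three CREDULE classes; the exact strings are an assumption.
LABELS = ("credible", "unreliable", "leaked")


@dataclass
class Evidence:
    """One piece of evidence retrieved from the Web (hypothetical container)."""
    url: str
    text: str


def filter_evidence(
    items: Iterable[Evidence],
    classify: Callable[[Evidence], str],
) -> List[Evidence]:
    """Keep only the items that the classifier labels as credible."""
    kept = []
    for item in items:
        label = classify(item)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        if label == "credible":
            kept.append(item)
    return kept


def toy_classifier(item: Evidence) -> str:
    """Hypothetical stand-in for a trained model: a crude URL keyword rule."""
    if "factcheck" in item.url:
        return "leaked"
    if "rumor" in item.url:
        return "unreliable"
    return "credible"


if __name__ == "__main__":
    collected = [
        Evidence("https://factcheck.example.org/a", "Claim rated false."),
        Evidence("https://news.example.org/b", "Report on the event."),
        Evidence("https://rumor.example.org/c", "Unverified story."),
    ]
    for ev in filter_evidence(collected, toy_classifier):
        print(ev.url)
```

Passing the classifier in as a callable keeps the filter independent of any particular model, so a trained text classifier could replace the toy rule without changing the surrounding code.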