Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation

In today's digital era, the rapid spread of misinformation threatens public well-being and societal trust. As online misinformation proliferates, manual verification by fact-checkers becomes increasingly challenging. We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching phase of fact-checking using Large Language Models (LLMs). The framework identifies new social media content that either supports or contradicts claims previously debunked by fact-checkers. Our approach employs GPT-4 to generate a labeled dataset of simulated social media posts, which then serves as training data for fine-tuning more specialized LLMs. We evaluated FACT-GPT on an extensive dataset of social media content related to public health. The results indicate that our fine-tuned LLMs rival the performance of larger pre-trained LLMs in claim matching tasks and align closely with human annotations. This study achieves three key milestones: it provides an automated framework for enhanced fact-checking; it demonstrates the potential of LLMs to complement human expertise; and it offers public resources, including datasets and models, to further research and applications in the fact-checking domain.
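The claim matching step described in the abstract can be pictured as a small classification wrapper around a language model. The sketch below is illustrative only and is not the FACT-GPT implementation: the label names (`SUPPORTS`, `CONTRADICTS`, `NEITHER`), the prompt wording, the helper names, and the `toy_generate` stand-in for a model call are all assumptions introduced here, and the example claim and posts are invented.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical label set; the abstract only mentions "supports" and "contradicts".
LABELS = ("SUPPORTS", "CONTRADICTS", "NEITHER")


@dataclass(frozen=True)
class MatchResult:
    claim: str
    post: str
    label: str


def build_prompt(claim: str, post: str) -> str:
    """Format a debunked claim and a post into a single classification prompt."""
    return (
        "You are assisting a fact-checker.\n"
        f"Debunked claim: {claim}\n"
        f"Social media post: {post}\n"
        f"Answer with exactly one word from: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> str:
    """Map free-form model output to a known label, defaulting to NEITHER."""
    text = response.strip().upper()
    for label in LABELS:
        if text.startswith(label):
            return label
    return "NEITHER"


def match_claim(claim: str, post: str, generate: Callable[[str], str]) -> MatchResult:
    """Run one claim/post pair through any text-generation callable."""
    response = generate(build_prompt(claim, post))
    return MatchResult(claim=claim, post=post, label=parse_label(response))


def toy_generate(prompt: str) -> str:
    """Keyword-based stand-in for an LLM call, used only to make the sketch runnable."""
    post_line = next(
        line for line in prompt.splitlines() if line.startswith("Social media post:")
    ).lower()
    if "false" in post_line or "myth" in post_line:
        return "contradicts"
    if "cure" in post_line:
        return "supports."
    return "I am not sure"


if __name__ == "__main__":
    claim = "Drinking hot water cures the flu."  # invented example claim
    for post in ("Hot water cures the flu, try it!", "That flu tip is a myth."):
        print(match_claim(claim, post, toy_generate).label)
```

In practice `generate` would wrap a call to a fine-tuned or hosted model; keeping it as a plain callable lets the parsing and prompt logic be tested without network access.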

