Misinformation undermines public trust in science and democracy, particularly on social media, where inaccuracies can spread rapidly. Experts and laypeople have been shown to be effective in correcting misinformation by manually identifying and explaining inaccuracies. Nevertheless, this approach is difficult to scale, a concern as technologies like large language models (LLMs) make misinformation easier to produce. LLMs also have versatile capabilities that could accelerate misinformation correction; however, they struggle due to a lack of recent information, a tendency to produce plausible but false content and references, and limitations in addressing multimodal information. To address these issues, we propose MUSE, an LLM augmented with access to up-to-date information and the ability to evaluate its credibility. By retrieving contextual evidence and refutations, MUSE can provide accurate and trustworthy explanations and references. It also describes visuals and conducts multimodal searches to correct multimodal misinformation. We recruit fact-checking and journalism experts to evaluate corrections to real social media posts across 13 dimensions, ranging from the factuality of the explanation to the relevance of references. The results demonstrate MUSE's ability to correct misinformation promptly after it appears on social media; overall, MUSE outperforms GPT-4 by 37% and even high-quality corrections from laypeople by 29%. This work underscores the potential of LLMs to combat real-world misinformation effectively and efficiently.