Although much research has focused on AI explanations to support decisions in complex information-seeking tasks such as fact-checking, the role of evidence is surprisingly under-researched. In our study, we systematically varied explanation type, AI prediction certainty, and the correctness of AI system advice for non-expert participants, who evaluated the veracity of claims and of the AI system's predictions. Participants could easily inspect the underlying evidence at their discretion. We found that participants consistently relied on evidence to validate AI claims across all experimental conditions. When presented with natural language explanations, participants consulted evidence less frequently, but still relied on it when those explanations seemed insufficient or flawed. Qualitative data suggest that participants attempted to infer the reliability of evidence sources even though source identities were deliberately omitted. Our results demonstrate that evidence is a key ingredient in how people evaluate the reliability of information presented by an AI system and that, in combination with natural language explanations, it offers valuable support for decision-making. Further research is urgently needed to understand how evidence ought to be presented and how people engage with it in practice.