Advanced artificial intelligence (AI) systems, specifically large language models (LLMs), can generate not only misinformation but also deceptive explanations that justify and propagate false information and erode trust in the truth. We examined the impact of deceptive AI-generated explanations on individuals' beliefs in a pre-registered online experiment with 23,840 observations from 1,192 participants. We found that, in addition to being more persuasive than accurate and honest explanations, deceptive AI-generated explanations can significantly amplify belief in false news headlines and undermine belief in true ones, compared with AI systems that simply misclassify a headline as true or false. Moreover, our results show that personal factors such as cognitive reflection and trust in AI do not necessarily protect individuals from the effects of deceptive AI-generated explanations. Instead, the logical validity of a deceptive AI-generated explanation, that is, whether the explanation has a causal bearing on the truthfulness of the AI's classification, plays a critical role in countering its persuasiveness: logically invalid explanations are deemed less credible. This underscores the importance of teaching logical reasoning and critical-thinking skills to identify logically invalid arguments, fostering greater resilience against advanced AI-driven misinformation.