AI, Take the Wheel: What Drives Delegation and Trust in Human-Computer Cooperative Question Answering?

AI systems are fallible, and humans can make mistakes in deciding whether to trust AI over their own judgment. Thus, improving human-AI collaboration requires understanding when, why, and how humans decide to rely on AI. We study two distinct reliance decisions: the delegation choice -- deciding when to let AI act autonomously without knowing its output, and the adoption choice -- evaluating AI suggestions and deciding how to use them. Both of these decoupled reliance patterns shape collaboration, but prior work rarely studies them together in realistic settings with the same users. We address this gap by studying collaborative human--AI teams competing in a question-answering game in which humans can choose when and how to work with AI agents to win. Our 24 matches pair 23 expert humans with 16 AI agents, capturing 387 delegation and 1440 adoption decisions. While human--AI collaboration performs better than either AI or humans alone, humans make suboptimal collaboration decisions, both under-relying on correct AI suggestions (3.9% of opportunities missed) and over-relying when AI misleads them (1.7%). Both parties contribute wrong answers: reported model confidence is near chance when humans and AI disagree, while confirmation bias drives higher under-reliance (64.5%) when an AI suggestion agrees with humans' initial incorrect answer. To close this gap, we recommend calibrated confidence, evidence-grounded explanations, and mechanisms that help users refine trust.
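
To make the under-reliance and over-reliance percentages concrete, the sketch below shows one way such rates could be computed from logged adoption decisions. It is not the authors' analysis code. The record fields, the `reliance_rates` function, and the exact definitions are assumptions for illustration: under-reliance is taken to mean a wrong final answer after the human started wrong and the AI suggestion was correct, and over-reliance a wrong final answer after the human started right and the AI suggestion was wrong. The toy records are invented and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class AdoptionDecision:
    """One hypothetical adoption decision (field names are assumptions)."""
    human_initial_correct: bool  # was the human's first answer right?
    ai_correct: bool             # was the AI suggestion right?
    final_correct: bool          # was the team's final answer right?


def reliance_rates(decisions):
    """Return under- and over-reliance as fractions of all decisions."""
    n = len(decisions)
    if n == 0:
        raise ValueError("need at least one decision")
    # Under-reliance: a correct AI suggestion could have fixed a wrong
    # initial answer, but the final answer was still wrong.
    under = sum(
        1 for d in decisions
        if d.ai_correct and not d.human_initial_correct and not d.final_correct
    )
    # Over-reliance: the human started right, the AI was wrong,
    # and the final answer ended up wrong.
    over = sum(
        1 for d in decisions
        if not d.ai_correct and d.human_initial_correct and not d.final_correct
    )
    return {"under_reliance": under / n, "over_reliance": over / n}


# Invented toy records, for illustration only.
toy = [
    AdoptionDecision(False, True, True),   # AI helped
    AdoptionDecision(False, True, False),  # missed opportunity
    AdoptionDecision(True, False, False),  # misled by AI
    AdoptionDecision(True, True, True),    # both right
]
rates = reliance_rates(toy)
```

Both rates here use all decisions as the denominator. The abstract's 64.5% figure is conditioned on a subset of cases, so reproducing a figure like that would need a denominator restricted to the matching subset.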

