The promise of AI systems to augment humans in various tasks hinges on whether humans can rely on them appropriately. Recent research has shown that appropriate reliance is key to achieving complementary team performance in AI-assisted decision making. This paper addresses an under-explored question: can the Dunning-Kruger Effect (DKE) hinder people's appropriate reliance on AI systems? DKE is a metacognitive bias in which less-competent individuals overestimate their own skill and performance. Through an empirical study (N = 249), we explored the impact of DKE on human reliance on an AI system, and whether such effects can be mitigated by two interventions: a tutorial that reveals the fallibility of AI advice, and logic units-based explanations intended to improve user understanding of AI advice. We found that participants who overestimate their performance tend to under-rely on AI systems, which hinders optimal team performance. Logic units-based explanations helped users neither calibrate their assessment of their own competence nor rely appropriately on the AI system. The tutorial intervention was highly effective in helping users calibrate their self-assessment and in facilitating appropriate reliance among participants who overestimated themselves; however, we found that it can potentially hurt the appropriate reliance of participants who underestimated themselves. Our work has broad implications for the design of methods that tackle user cognitive biases while facilitating appropriate reliance on AI systems. Our findings advance the current understanding of the role of self-assessment in shaping trust and reliance in human-AI decision making, and lay out promising directions for future HCI research.