In human-AI collaboration, users build a mental model of the AI system based on its reliability and on how it presents its decisions, for example its stated confidence and its explanation of the output. Modern NLP systems are often uncalibrated, resulting in confidently incorrect predictions that undermine user trust. To build trustworthy AI, we must understand how user trust develops and how it can be regained after trust-eroding events. We study the evolution of user trust in response to such events using a betting game. We find that even a few incorrect instances with inaccurate confidence estimates damage user trust and performance, and that recovery is very slow. We also show that this degradation in trust reduces the success of human-AI collaboration, and that the two types of miscalibration (unconfidently correct and confidently incorrect) have different negative effects on user trust. Our findings highlight the importance of calibration in user-facing AI applications and shed light on which aspects help users decide whether to trust the AI system.