Is Ignorance Bliss? The Role of Post Hoc Explanation Faithfulness and Alignment in Model Trust in Laypeople and Domain Experts

Post hoc explanations have emerged as a way to improve user trust in machine learning models by providing insight into model decision-making. However, explanations tend to be evaluated based on their alignment with prior knowledge while the faithfulness of an explanation with respect to the model, a fundamental criterion, is often overlooked. Furthermore, the effect of explanation faithfulness and alignment on user trust and whether this effect differs among laypeople and domain experts is unclear. To investigate these questions, we conduct a user study with computer science students and doctors in three domain areas, controlling the laypeople and domain expert groups in each setting. The results indicate that laypeople base their trust in explanations on explanation faithfulness while domain experts base theirs on explanation alignment. To our knowledge, this work is the first to show that (1) different factors affect laypeople and domain experts' trust in post hoc explanations and (2) domain experts are subject to specific biases due to their expertise when interpreting post hoc explanations. By uncovering this phenomenon and exposing this cognitive bias, this work motivates the need to educate end users about how to properly interpret explanations and overcome their own cognitive biases, and motivates the development of simple and interpretable faithfulness metrics for end users. This research is particularly important and timely as post hoc explanations are increasingly being used in high-stakes, real-world settings such as medicine.
