Automated decision-making systems are becoming increasingly ubiquitous, which creates an immediate need for their interpretability and explainability. However, it remains unclear whether users know what insights an explanation offers and, more importantly, what information it lacks. To answer this question, we conducted an online study with 200 participants, which allowed us to assess explainees' ability to realise explicated information (i.e., factual insights conveyed by an explanation) and unspecified information (i.e., insights that are not communicated by an explanation) across four representative explanation types: model architecture, decision surface visualisation, counterfactual explainability and feature importance. Our findings reveal that highly comprehensible explanations, e.g., feature importance and decision surface visualisation, are exceptionally susceptible to misinterpretation since users tend to infer spurious information that is outside the scope of these explanations. Additionally, while users gauge their confidence accurately with respect to the information explicated by these explanations, they tend to be overconfident when misinterpreting them. Our work demonstrates that human comprehension can be a double-edged sword since highly accessible explanations may convince users of their truthfulness while possibly leading to various misinterpretations at the same time. Machine learning explanations should therefore carefully navigate the complex relation between their full scope and limitations to maximise understanding and curb misinterpretation.