Machine Explanations and Human Understanding

Explanations are hypothesized to improve human understanding of machine learning models and achieve a variety of desirable outcomes, ranging from model debugging to enhancing human decision making. However, empirical studies have found mixed and even negative results. An open question, therefore, is under what conditions explanations can improve human understanding and in what way. Using adapted causal diagrams, we provide a formal characterization of the interplay between machine explanations and human understanding, and show how human intuitions play a central role in enabling human understanding. Specifically, we identify three core concepts of interest that cover all existing quantitative measures of understanding in the context of human-AI decision making: task decision boundary, model decision boundary, and model error. Our key result is that without assumptions about task-specific intuitions, explanations may potentially improve human understanding of model decision boundary, but they cannot improve human understanding of task decision boundary or model error. To achieve complementary human-AI performance, we articulate possible ways on how explanations need to work with human intuitions. For instance, human intuitions about the relevance of features (e.g., education is more important than age in predicting a person's income) can be critical in detecting model error. We validate the importance of human intuitions in shaping the outcome of machine explanations with empirical human-subject studies. Overall, our work provides a general framework along with actionable implications for future algorithmic development and empirical experiments of machine explanations.

翻译：解释被假设能够提升人类对机器学习模型的理解，并实现从模型调试到增强人类决策等一系列理想结果。然而，实证研究发现了混合甚至负面的结果。因此，一个悬而未决的问题是：在何种条件下，解释能够以何种方式提升人类理解？通过采用因果图框架，我们形式化地刻画了机器解释与人类理解之间的相互作用，并揭示了人类直觉在促成人类理解中的核心作用。具体而言，我们识别出涵盖人类-AI决策背景下所有现有理解定量度量的三个核心概念：任务决策边界、模型决策边界和模型错误。我们的关键结论是：若缺乏对任务特定直觉的假设，解释可能提升人类对模型决策边界的理解，但无法提升对任务决策边界或模型错误的理解。为实现人类与AI的互补性能，我们阐明了解释需要与人类直觉协同作用的可能路径。例如，人类关于特征相关性的直觉（如预测个人收入时教育比年龄更重要）在检测模型错误中可能至关重要。我们通过实证人类受试者研究验证了人类直觉在塑造机器解释结果中的重要性。总体而言，我们的工作为未来机器解释的算法开发与实证实验提供了一个通用框架及可操作启示。