Machine Explanations and Human Understanding

Explanations are hypothesized to improve human understanding of machine learning models and to achieve a variety of desirable outcomes, ranging from model debugging to enhancing human decision making. However, empirical studies have found mixed and even negative results. An open question, therefore, is under what conditions explanations can improve human understanding, and in what way. Using adapted causal diagrams, we provide a formal characterization of the interplay between machine explanations and human understanding, and show that human intuitions play a central role in enabling human understanding. Specifically, we identify three core concepts of interest that cover all existing quantitative measures of understanding in the context of human-AI decision making: the task decision boundary, the model decision boundary, and model error. Our key result is that, without assumptions about task-specific intuitions, explanations may improve human understanding of the model decision boundary, but they cannot improve human understanding of the task decision boundary or of model error. To achieve complementary human-AI performance, we articulate ways in which explanations need to work with human intuitions. For instance, human intuitions about the relevance of features (e.g., education is more important than age in predicting a person's income) can be critical in detecting model error. We validate the importance of human intuitions in shaping the outcome of machine explanations with empirical human-subject studies. Overall, our work provides a general framework, along with actionable implications, for future algorithmic development and empirical experiments on machine explanations.
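
The three concepts named above can be made concrete with a toy sketch. This is only an illustration under stated assumptions, not the paper's formal framework: the income rules, the feature thresholds, the four data points, and all function names below are hypothetical. The sketch scores two imagined humans on each kind of understanding: one who can simulate the model perfectly but has no task-specific intuition, and one who also holds the intuition that education matters more than age.

```python
# Toy illustration (all rules, thresholds, and data are hypothetical).

def task_label(x):
    # Assumed ground-truth rule: stands in for the task decision boundary.
    return int(x["education"] >= 12)

def model_label(x):
    # Assumed model rule: stands in for the model decision boundary.
    # It (wrongly) relies on age instead of education.
    return int(x["age"] >= 40)

def model_error(x):
    # Model error: 1 when the model disagrees with the ground truth.
    return int(task_label(x) != model_label(x))

def agreement(guess, target, instances):
    # Fraction of instances on which a human's guess matches the target.
    return sum(guess(x) == target(x) for x in instances) / len(instances)

instances = [
    {"education": 16, "age": 45},
    {"education": 16, "age": 25},
    {"education": 10, "age": 50},
    {"education": 10, "age": 30},
]

# Human A: simulates the model perfectly but has no task intuition,
# so they adopt the model's label and never flag an error.
human_a = {
    "model": model_label,
    "task": model_label,
    "error": lambda x: 0,
}

# Human B: same model knowledge, plus the intuition that education
# (not age) should drive the prediction.
human_b = {
    "model": model_label,
    "task": lambda x: int(x["education"] >= 12),
    "error": lambda x: int(int(x["education"] >= 12) != model_label(x)),
}

def understanding(human):
    # One score per concept: model boundary, task boundary, model error.
    return {
        "model": agreement(human["model"], model_label, instances),
        "task": agreement(human["task"], task_label, instances),
        "error": agreement(human["error"], model_error, instances),
    }

scores_a = understanding(human_a)
scores_b = understanding(human_b)
print(scores_a)
print(scores_b)
```

In this toy setting, both humans understand the model decision boundary equally well, and only the one with the feature-relevance intuition scores above chance on the task decision boundary and on model error. The numbers depend entirely on the hypothetical rules chosen here.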
