Considerations When Learning Additive Explanations for Black-Box Models

from arxiv, Published at Machine Learning (2023). Previously titled "Learning Global Additive Explanations for Neural Nets Using Model Distillation". A short version was presented at NeurIPS 2018 Machine Learning for Health Workshop

Many methods to explain black-box models, whether local or global, are additive. In this paper, we study global additive explanations for non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize non-additive components in a black-box model's prediction function in different ways. We use the concepts of main and total effects to anchor additive explanations, and quantitatively evaluate additive and non-additive explanations. Even though distilled explanations are generally the most accurate additive explanations, non-additive explanations such as tree explanations that explicitly model non-additive components tend to be even more accurate. Despite this, our user study showed that machine learning practitioners were better able to leverage additive explanations for various tasks. These considerations should be taken into account when considering which explanation to trust and use to explain black-box models.

翻译：许多解释黑盒模型的方法，无论是局部还是全局的，都具有可加性。本文研究了非可加模型的全局可加性解释，重点关注四种解释方法：部分依赖图、适应全局设置的沙普利解释、蒸馏可加性解释以及基于梯度的解释。我们表明，不同的解释方法以不同方式刻画黑盒模型预测函数中的非可加成分。我们使用主效应和总效应的概念来锚定可加性解释，并定量评估可加性与非可加性解释。尽管蒸馏解释通常是最准确的可加性解释，但明确建模非可加成分的非可加性解释（例如树解释）往往更为准确。尽管如此，我们的用户研究表明，机器学习从业者能更好地利用可加性解释完成各种任务。在选择信任并使用哪种解释来解释黑盒模型时，应充分考虑这些因素。

相关内容

黑盒

关注 1

在科学，计算和工程学中，黑盒是一种设备，系统或对象，可以根据其输入和输出（或传输特性）对其进行查看，而无需对其内部工作有任何了解。它的实现是“不透明的”（黑色）。几乎任何事物都可以被称为黑盒：晶体管，引擎，算法，人脑，机构或政府。为了使用典型的“黑匣子方法”来分析建模为开放系统的事物，仅考虑刺激/响应的行为，以推断（未知）盒子。该黑匣子系统的通常表示形式是在该方框中居中的数据流程图。黑盒的对立面是一个内部组件或逻辑可用于检查的系统，通常将其称为白盒（有时也称为“透明盒”或“玻璃盒”）。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日