Explaining artificial intelligence or machine learning models is an increasingly important problem. For humans to stay in the loop and control such systems, we must be able to understand how they interact with the world. This work proposes using known or assumed causal structure in the input variables to produce simple and practical explanations of supervised learning models. Our explanations, which we call Causal Dependence Plots (CDPs), visualize how the model output depends on changes in a given predictor \emph{along with any consequent causal changes in other predictors}. Because this causal dependence captures how humans often think about input-output dependence, CDPs can be powerful tools in the explainable AI or interpretable ML toolkit and can contribute to applications including scientific machine learning and algorithmic fairness. CDPs can also be used for model-agnostic or black-box explanations.
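To make the idea of causal dependence concrete, the following is a minimal sketch under stated assumptions, not the construction used in this work. It assumes a hypothetical two-variable linear structural causal model in which x1 causes x2, and a stand-in predictor named model. The function names total_dependence and partial_dependence are illustrative. The first curve sets x1 to each grid value and lets x2 respond through the assumed causal mechanism; the second holds x2 at its observed values, in the style of a partial dependence plot, for contrast.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical structural causal model (an assumption for illustration):
#   x1 := u1,   x2 := 2 * x1 + u2
n = 500
u1 = rng.normal(size=n)
u2 = rng.normal(size=n)
x1 = u1
x2 = 2.0 * x1 + u2


def model(x1, x2):
    """Stand-in black-box predictor; a fitted model's predict function could go here."""
    return x1 + 3.0 * x2


def total_dependence(grid):
    """Mean prediction when x1 is set to each grid value and x2 responds causally."""
    curve = []
    for v in grid:
        x1_new = np.full(n, v)
        # Propagate the change downstream, keeping each unit's noise term u2 fixed.
        x2_new = 2.0 * x1_new + u2
        curve.append(model(x1_new, x2_new).mean())
    return np.array(curve)


def partial_dependence(grid):
    """Mean prediction when x1 is set to each grid value and x2 stays as observed."""
    return np.array([model(np.full(n, v), x2).mean() for v in grid])


grid = np.linspace(-2.0, 2.0, 9)
cdp_curve = total_dependence(grid)
pdp_curve = partial_dependence(grid)
```

Under these assumptions the two curves have different slopes in x1: the causally propagated curve includes the effect that passes through x2, while the held-fixed curve reflects only the direct dependence of the predictor on x1. Plotting both against grid gives a simple visual comparison.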