Many existing interpretation methods are based on Partial Dependence (PD) functions that, for a pre-trained machine learning model, capture how a subset of the features affects the predictions by averaging over the remaining features. Notable methods include Shapley additive explanations (SHAP), which computes feature contributions based on a game-theoretic interpretation, and PD plots (i.e., one-dimensional PD functions), which capture average marginal main effects. Recent work has connected these approaches via a functional decomposition and argues that SHAP values can be misleading, since they merge main and interaction effects into a single local effect. A major advantage of SHAP over other PD-based interpretations, however, has been the availability of fast estimation techniques, such as \texttt{TreeSHAP}. In this paper, we propose a new tree-based estimator, \texttt{FastPD}, which efficiently estimates arbitrary PD functions. We show that \texttt{FastPD} consistently estimates the desired population quantity -- in contrast to path-dependent \texttt{TreeSHAP}, which is inconsistent when features are correlated. For moderately deep trees, \texttt{FastPD} improves the complexity of existing methods from quadratic to linear in the number of observations. By estimating PD functions for arbitrary feature subsets, \texttt{FastPD} can be used to extract PD-based interpretations such as SHAP, PD plots and higher-order interaction effects.
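To make the notion of a PD function concrete, the following is a minimal brute-force sketch of estimating a PD function by fixing a feature subset at grid values and averaging model predictions over the observed values of the remaining features. The helper name \texttt{partial\_dependence} and the toy linear model are illustrative, not part of the paper's method, and this naive estimator has the quadratic cost (in observations times grid points) that \texttt{FastPD} is designed to avoid.

```python
import numpy as np

def partial_dependence(model, X, feature_subset, grid_values):
    """Brute-force PD estimate: for each grid value v, fix the features in
    feature_subset at v and average the model's predictions over the
    empirical distribution of the remaining features."""
    pd_values = []
    for v in grid_values:
        X_mod = X.copy()
        X_mod[:, feature_subset] = v  # fix the subset at v for every row
        pd_values.append(model(X_mod).mean())
    return np.array(pd_values)

# Toy example: for f(x) = 2*x0 + x1, the PD of feature 0 at v is
# 2*v + E[x1], so the PD curve has slope 2 in v.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
model = lambda X: 2 * X[:, 0] + X[:, 1]
pd = partial_dependence(model, X, [0], [0.0, 1.0])
```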