Statistical Inference and Learning for Shapley Additive Explanations (SHAP)

The SHAP (short for Shapley additive explanations) framework has become an essential tool for attributing importance to variables in predictive tasks. In model-agnostic settings, SHAP uses the concept of Shapley values from cooperative game theory to fairly allocate credit to the features in a vector $X$ based on their contribution to an outcome $Y$. While the explanations offered by SHAP are local by nature, learners often need global measures of feature importance in order to improve model explainability and perform feature selection. The most common approach for converting these local explanations into global ones is to compute either the mean absolute SHAP or the mean squared SHAP. However, despite the ubiquity of these quantities, no approaches exist for performing statistical inference on them. In this paper, we take a semi-parametric approach to calibrating confidence in estimates of the $p$th powers of Shapley additive explanations. We show that, by treating the SHAP curve as a nuisance function that must be estimated from data, one can reliably construct asymptotically normal estimates of the $p$th powers of SHAP. When $p \geq 2$, we show that a de-biased estimator that combines U-statistics with Neyman orthogonal scores for functionals of nested regressions is asymptotically normal. When $1 \leq p < 2$ (and hence the target parameter is not twice differentiable), we construct de-biased U-statistics for a smoothed alternative. In particular, we show how to carefully tune the temperature parameter of the smoothing function in order to obtain inference for the true, unsmoothed $p$th power. We complement these results by presenting a Neyman orthogonal loss that can be used to learn the SHAP curve via empirical risk minimization, and by discussing excess risk guarantees for commonly used function classes.
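To make the target quantities concrete, the following is a minimal sketch of the naive plug-in aggregation that turns local SHAP values into a global $p$th-power importance, with $p = 1$ giving the mean absolute SHAP and $p = 2$ the mean squared SHAP. It is not the de-biased estimator studied in the paper and carries no inferential guarantee. The function name `global_shap_power` and the toy matrix `phi` are hypothetical, and the sketch assumes the local SHAP values have already been computed and stored as an array with one row per observation and one column per feature.

```python
import numpy as np


def global_shap_power(shap_values, p=1.0):
    """Naive plug-in estimate of E[|phi_j(X)|^p] for each feature j.

    shap_values: array-like of shape (n_samples, n_features) holding
        precomputed local SHAP values (assumed input, not computed here).
    p: power applied to the absolute SHAP values; must be >= 1.
    """
    phi = np.asarray(shap_values, dtype=float)
    if phi.ndim != 2:
        raise ValueError("shap_values must be a 2-D array")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Average |phi|^p over observations, separately for each feature.
    return np.mean(np.abs(phi) ** p, axis=0)


# Toy SHAP matrix: 3 observations, 2 features (illustrative values only).
phi = np.array([[1.0, -2.0],
                [-3.0, 0.0],
                [2.0, 2.0]])

mean_abs_shap = global_shap_power(phi, p=1)  # mean absolute SHAP per feature
mean_sq_shap = global_shap_power(phi, p=2)   # mean squared SHAP per feature
print(mean_abs_shap, mean_sq_shap)
```

Because the SHAP values themselves are estimated from data, a standard error computed directly from these averages ignores the estimation error in the SHAP curve, which is the gap the de-biased construction described above is meant to close.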
