In explainable machine learning, local post-hoc explanation algorithms and inherently interpretable models are often seen as competing approaches. This work offers a partial reconciliation between the two by establishing a correspondence between Shapley Values and Generalized Additive Models (GAMs). We introduce $n$-Shapley Values, a parametric family of local post-hoc explanation algorithms that explain individual predictions with interaction terms up to order $n$. By varying the parameter $n$, we obtain a sequence of explanations that covers the entire range from Shapley Values up to a uniquely determined decomposition of the function we want to explain. The relationship between $n$-Shapley Values and this decomposition offers a functionally-grounded characterization of Shapley Values, which highlights their limitations. We then show that $n$-Shapley Values, as well as the Shapley-Taylor and Faith-Shap interaction indices, recover GAMs with interaction terms up to order $n$. This implies that the original Shapley Values recover GAMs without variable interactions. Taken together, our results provide a precise characterization of Shapley Values as they are used in explainable machine learning. They also offer a principled interpretation of partial dependence plots of Shapley Values in terms of the underlying functional decomposition. A package for the estimation of different interaction indices is available at \url{https://github.com/tml-tuebingen/nshap}.
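The claim that the original Shapley Values recover GAMs without variable interactions can be checked numerically on a toy case. The sketch below is illustrative only and does not use the nshap package: the additive function `f`, the `background` sample, the point `x`, and the interventional value function (features outside the coalition are averaged over the background sample) are all assumptions made for this example, since the abstract does not fix a value function. Under these assumptions, the exact Shapley Value of each feature should equal that feature's component function evaluated at `x`, centered by its mean over the background sample.

```python
import itertools
import math


def shapley_values(value, d):
    """Exact Shapley Values of the set function `value` over features 0..d-1."""
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            # Shapley weight for coalitions of size r that exclude feature i
            w = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                S = frozenset(S)
                phi[i] += w * (value(S | {i}) - value(S))
    return phi


# Hypothetical GAM without interactions: f(x) = g_0(x_0) + g_1(x_1) + g_2(x_2)
components = [lambda t: 2.0 * t, lambda t: t ** 2, lambda t: math.sin(t)]


def f(x):
    return sum(g(t) for g, t in zip(components, x))


# Hypothetical background sample and the point to explain
background = [(0.0, 1.0, 2.0), (1.0, -1.0, 0.5), (2.0, 0.5, -1.0), (-1.0, 2.0, 1.5)]
x = (1.5, -0.5, 0.3)


def value(S):
    """Assumed interventional value function: features in S are fixed to x,
    the remaining features are averaged over the background sample."""
    total = 0.0
    for z in background:
        mixed = tuple(x[j] if j in S else z[j] for j in range(len(x)))
        total += f(mixed)
    return total / len(background)


phi = shapley_values(value, len(x))

# Component functions at x, centered by their mean over the background sample
centered = [
    g(x[i]) - sum(g(z[i]) for z in background) / len(background)
    for i, g in enumerate(components)
]

for i, (p, c) in enumerate(zip(phi, centered)):
    print(f"feature {i}: Shapley Value = {p:.6f}, centered component = {c:.6f}")
```

The two printed columns agree up to floating-point error because, for an additive function, the marginal contribution of a feature is the same for every coalition. Adding an interaction term such as `x[0] * x[1]` to `f` breaks this agreement, which is the situation the higher-order terms of $n$-Shapley Values are meant to capture.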