Shapley values are widely used in explainable artificial intelligence (XAI) as a framework for explaining predictions made by complex machine learning (ML) models. In this work, we focus on conditional Shapley values for predictive models fitted to tabular data and explain the prediction $f(\boldsymbol{x}^{*})$ for a single observation $\boldsymbol{x}^{*}$ at a time. Numerous Shapley value estimation methods have been proposed and empirically compared on an average basis in the XAI literature. However, less attention has been devoted to analyzing the precision of the Shapley value explanations on an individual basis. We extend our work in Olsen et al. (2023) by demonstrating and discussing that, for all estimation methods considered, the explanations are systematically less precise for observations in the outer region of the training data distribution. This is expected from a statistical point of view, but to the best of our knowledge, it has not been systematically addressed in the Shapley value literature. This is crucial knowledge for Shapley value practitioners, who should be more careful when applying the Shapley value explanations for such observations.