This paper develops a rigorous argument for why the use of Shapley values in explainable AI (XAI) will necessarily yield provably misleading information about the relative importance of features for predictions. Concretely, this paper demonstrates that there exist classifiers, and associated predictions, for which the relative importance of features determined by the Shapley values will incorrectly assign more importance to features that are provably irrelevant for the prediction, and less importance to features that are provably relevant for the prediction. The paper also argues that, given recent complexity results, the existence of efficient algorithms for the computation of rigorous feature attribution values in the case of some restricted classes of classifiers should be deemed unlikely at best.
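The central claim can be checked by brute force on a toy case. The sketch below is illustrative only and is not one of the paper's own constructions. It assumes a Boolean classifier, a uniform distribution over inputs, a value function equal to the expected classifier output with a subset of features fixed to the instance, and "relevant" meaning membership in at least one subset-minimal sufficient set of features; none of these definitions is stated in the abstract. The names `kappa`, `value`, `shapley`, and `relevant_features` are hypothetical. The toy case shows only the weaker phenomenon, a provably irrelevant feature receiving a nonzero Shapley value, and not the full ranking inversion.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def completions(v, fixed):
    """Yield every input that agrees with v on the features in `fixed`."""
    free = [i for i in range(len(v)) if i not in fixed]
    for bits in product((0, 1), repeat=len(free)):
        x = list(v)
        for i, b in zip(free, bits):
            x[i] = b
        yield tuple(x)


def value(kappa, v, fixed):
    """Assumed value function: mean of kappa over all completions (uniform inputs)."""
    outputs = [kappa(x) for x in completions(v, fixed)]
    return Fraction(sum(outputs), len(outputs))


def shapley(kappa, v):
    """Exact Shapley value of each feature for instance v, by full enumeration."""
    n = len(v)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = Fraction(0)
        for k in range(n):
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                s = set(subset)
                phi += weight * (value(kappa, v, s | {i}) - value(kappa, v, s))
        result.append(phi)
    return result


def is_sufficient(kappa, v, fixed):
    """True if fixing `fixed` to v's values forces the prediction kappa(v)."""
    return all(kappa(x) == kappa(v) for x in completions(v, fixed))


def relevant_features(kappa, v):
    """Union of all subset-minimal sufficient sets (assumed notion of relevance)."""
    n = len(v)
    relevant = set()
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            s = set(subset)
            # Sufficiency is monotone, so minimality only needs single removals.
            if is_sufficient(kappa, v, s) and not any(
                is_sufficient(kappa, v, s - {j}) for j in s
            ):
                relevant |= s
    return relevant


# Toy classifier (hypothetical): logical OR of two features, instance (1, 0).
kappa = lambda x: int(x[0] or x[1])
v = (1, 0)
sv = shapley(kappa, v)
relevant = relevant_features(kappa, v)
print(sv)        # [Fraction(3, 8), Fraction(-1, 8)]
print(relevant)  # {0}
```

Here fixing feature 0 alone already forces the prediction, so feature 1 belongs to no minimal sufficient set, yet it receives the nonzero Shapley value -1/8. Exact `Fraction` arithmetic is used so that a zero or nonzero attribution is never an artifact of floating-point rounding.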