A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the recent ubiquitous use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretic foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results, tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent only a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e., how power indices can be used in XAI, and which of these indices would be best suited for XAI by feature attribution, namely in terms of not producing results that could be deemed unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.
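To make the notion of a priori voting power concrete, the following is a minimal, illustrative sketch (not the paper's proposed FISs) of two classic power indices, Shapley-Shubik and Banzhaf, computed by brute force for a small weighted voting game. The example game `[51; 50, 49, 1]` is a standard illustration: the voter with weight 1 turns out to have exactly the same a priori power as the voter with weight 49, showing how raw weights (analogously, naive importance scores) can mislead.

```python
from itertools import permutations, combinations
from fractions import Fraction

def shapley_shubik(weights, quota):
    """Shapley-Shubik index: for each voter, the fraction of orderings
    in which that voter is pivotal (turns a losing coalition winning)."""
    n = len(weights)
    pivots = [0] * n
    for order in permutations(range(n)):
        total = 0
        for voter in order:
            total += weights[voter]
            if total >= quota:       # this voter pushed the coalition over the quota
                pivots[voter] += 1
                break
    total_perms = sum(pivots)        # equals n! (exactly one pivot per ordering)
    return [Fraction(p, total_perms) for p in pivots]

def banzhaf(weights, quota):
    """Normalized Banzhaf index: for each voter, the (normalized) number of
    winning coalitions in which that voter is critical, i.e. removing the
    voter makes the coalition lose."""
    n = len(weights)
    swings = [0] * n
    for r in range(1, n + 1):
        for coalition in combinations(range(n), r):
            total = sum(weights[i] for i in coalition)
            if total >= quota:
                for i in coalition:
                    if total - weights[i] < quota:
                        swings[i] += 1
    total_swings = sum(swings)
    return [Fraction(s, total_swings) for s in swings]

# Weighted voting game [quota; weights] = [51; 50, 49, 1].
print(shapley_shubik([50, 49, 1], 51))  # [2/3, 1/6, 1/6]
print(banzhaf([50, 49, 1], 51))         # [3/5, 1/5, 1/5]
```

Under both indices, the weight-49 and weight-1 voters receive identical power, since they are pivotal/critical in exactly the same situations; this is the kind of gap between nominal weight and actual influence that power-index-based feature importance scores are meant to capture.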