Explainability methods are challenging to evaluate and compare. With a multitude of explainers available, practitioners must often select among them using quantitative evaluation metrics. One differentiator between explainers is the diversity of the explanations they produce for a given dataset: the explanations may all be identical, may all be unique and uniformly distributed, or may fall somewhere between these two extremes. In this work, we define a complexity measure for explainers, globalness, which characterizes the distribution of explanations produced by feature attribution and feature selection methods for a given dataset. We establish the axiomatic properties that any such measure should possess and prove that our proposed measure, Wasserstein Globalness, satisfies them. We validate the utility of Wasserstein Globalness on image, tabular, and synthetic datasets, showing empirically that it both facilitates meaningful comparison between explainers and improves the selection of explainability methods.
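To make the two extremes concrete, the following is a minimal one-dimensional sketch, not the paper's definition, which the abstract does not state. It assumes that a globalness-style score can be read as the Wasserstein distance between the empirical distribution of explanations and a uniform reference, so that identical explanations score high and uniformly spread explanations score near zero. The names `wasserstein_1d` and `toy_globalness`, the `[0, 1]` attribution range, and the evenly spaced reference are all hypothetical choices made for illustration.

```python
import numpy as np


def wasserstein_1d(a, b):
    """Exact 1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size empirical samples, the optimal transport plan matches
    sorted values, so the distance is the mean absolute difference of
    the sorted arrays.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(a - b)))


def toy_globalness(attributions, low=0.0, high=1.0):
    """Hypothetical score: distance from the attributions to a uniform reference.

    The reference is an evenly spaced grid on [low, high], standing in for
    a uniform distribution (an assumption of this sketch).
    """
    x = np.asarray(attributions, dtype=float)
    n = x.size
    reference = low + (high - low) * (np.arange(n) + 0.5) / n
    return wasserstein_1d(x, reference)


# Extreme 1: every input receives the same attribution (fully "global").
identical = np.full(1000, 0.5)

# Extreme 2: attributions spread uniformly over the range (fully "local").
rng = np.random.default_rng(0)
spread = rng.uniform(0.0, 1.0, size=1000)

g_identical = toy_globalness(identical)
g_spread = toy_globalness(spread)
print(f"identical explanations: {g_identical:.4f}")
print(f"uniformly spread explanations: {g_spread:.4f}")
```

Real explanations are vectors rather than scalars, so a practical version would need a multivariate optimal transport solver and a reference distribution suited to the explanation space; this sketch only illustrates the ordering of the two extremes.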