Explaining the decisions of black-box classifiers is both important and computationally challenging. In this paper, we scrutinize explainers that generate feature-based explanations from samples or datasets. We begin by presenting a set of desirable properties that explainers would ideally satisfy, delve into their relationships, and highlight incompatibilities between some of them. We identify the entire family of explainers that satisfy two key properties compatible with all the others; its instances provide sufficient reasons, called weak abductive explanations. We then unravel its various subfamilies that satisfy subsets of compatible properties. Indeed, we fully characterize all the explainers that satisfy any subset of compatible properties. In particular, we introduce the first (broad family of) explainers that guarantee both the existence of explanations and their global consistency. We discuss some of its instances, including the irrefutable explainer and the surrogate explainer, whose explanations can be found in polynomial time.