Explaining the decisions of black-box classifiers is both important and computationally challenging. In this paper, we scrutinize explainers that generate feature-based explanations from samples or datasets. We start by presenting a set of desirable properties that explainers would ideally satisfy, delve into their relationships, and highlight incompatibilities between some of them. We identify the entire family of explainers that satisfy two key properties which are compatible with all the others; its instances provide sufficient reasons, called weak abductive explanations. We then unravel its various subfamilies that satisfy subsets of compatible properties. Indeed, we fully characterize all the explainers that satisfy any subset of compatible properties. In particular, we introduce the first (broad family of) explainers that guarantee the existence of explanations and their global consistency. We discuss some of its instances, including the irrefutable explainer and the surrogate explainer, whose explanations can be found in polynomial time.