A central quest in explainable AI is understanding the decisions made by (learned) classifiers. Three dimensions of this understanding have received significant attention in recent years. The first concerns characterizing conditions on instances that are necessary and sufficient for decisions, thereby providing abstractions of instances that can be viewed as the "reasons behind decisions." The second concerns characterizing minimal conditions that are sufficient for a decision, thereby identifying maximal aspects of the instance that are irrelevant to the decision. The third concerns characterizing minimal conditions that are necessary for a decision, thereby identifying minimal perturbations to the instance that yield alternate decisions. In this tutorial, we discuss a comprehensive semantic and computational theory of explainability along these dimensions, based on recent developments in symbolic logic. The tutorial also discusses how this theory is particularly applicable to non-symbolic classifiers, such as those based on Bayesian networks, decision trees, random forests and some types of neural networks.
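To make the second and third dimensions concrete, here is a minimal brute-force sketch in Python. It is not the tutorial's method, which relies on symbolic logic rather than enumeration. The three-feature Boolean classifier `classify`, the feature names, and the instance are hypothetical, chosen only for illustration. A set of features is treated as sufficient if fixing those features to their instance values forces the decision however the rest are set; a set of features is treated as a possible perturbation if changing only those features can yield a different decision.

```python
from itertools import combinations, product

# Hypothetical features and classifier, for illustration only.
FEATURES = ("a", "b", "c")


def classify(x):
    """Toy classifier: positive iff a and (b or c)."""
    return x["a"] and (x["b"] or x["c"])


def is_sufficient(instance, kept):
    """True if fixing `kept` to the instance's values forces the decision."""
    decision = classify(instance)
    free = [f for f in FEATURES if f not in kept]
    for values in product([False, True], repeat=len(free)):
        candidate = dict(instance)
        candidate.update(zip(free, values))
        if classify(candidate) != decision:
            return False
    return True


def can_change(instance, changed):
    """True if altering only the features in `changed` can alter the decision."""
    kept = frozenset(FEATURES) - changed
    return not is_sufficient(instance, kept)


def minimal_sets(predicate):
    """Subset-minimal feature sets satisfying a monotone predicate."""
    found = []
    for size in range(len(FEATURES) + 1):
        for subset in combinations(FEATURES, size):
            s = frozenset(subset)
            # Skip supersets of sets already found: they are not minimal.
            if any(m <= s for m in found):
                continue
            if predicate(s):
                found.append(s)
    return found


INSTANCE = {"a": True, "b": True, "c": False}

sufficient = minimal_sets(lambda s: is_sufficient(INSTANCE, s))
perturbations = minimal_sets(lambda s: can_change(INSTANCE, s))

print("decision:", classify(INSTANCE))
print("minimal sufficient feature sets:", [sorted(s) for s in sufficient])
print("minimal perturbations:", [sorted(s) for s in perturbations])
```

For this instance the only minimal sufficient set is {a, b}, so feature c is irrelevant to the decision, while changing a alone or b alone is enough to yield the alternate decision. The enumeration is exponential in the number of features, so it is only meant to illustrate the definitions on a toy example.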