Explainable Artificial Intelligence (XAI) is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the subject, yet XAI still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that synthesizes what can be found in the literature. We recognize that explanations are not atomic but rather the combination of evidence stemming from the model and its input-output mapping, and the human interpretation of this evidence. Furthermore, we characterize explanations in terms of the properties of faithfulness (i.e., the explanation being a true description of the model's inner workings and decision-making process) and plausibility (i.e., how convincing the explanation looks to the user). Our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods, which we analyze as case studies.