The last decade has witnessed an ever-increasing stream of successes in Machine Learning (ML). These successes offer clear evidence that ML is bound to become pervasive in a wide range of practical uses, including many that directly affect humans. Unfortunately, the operation of the most successful ML models is incomprehensible to human decision makers. As a result, the use of ML models, especially in high-risk and safety-critical settings, is not without concern. In recent years, there have been efforts to devise approaches for explaining ML models. Most of these efforts have focused on so-called model-agnostic approaches. However, all model-agnostic and related approaches offer no guarantees of rigor, and are therefore referred to as non-formal. For example, a non-formal explanation can be consistent with different predictions, which renders it useless in practice. This paper overviews the ongoing research efforts on computing rigorous model-based explanations of ML models, referred to as formal explanations. These efforts encompass a variety of topics, including the actual definitions of explanations, the characterization of the complexity of computing explanations, the currently best logical encodings for reasoning about different ML models, and how to make explanations interpretable for human decision makers, among others.