Most commonly used non-linear machine learning methods are closed-box models, uninterpretable to humans. The field of explainable artificial intelligence (XAI) aims to develop tools to examine the inner workings of these closed boxes. An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations; examples of this approach include LIME, SHAP, and SLISEMAP. This paper shows how a large set of local explanations can be reduced to a small "proxy set" of simple models, which can act as a generative global explanation. This reduction procedure, ExplainReduce, can be formulated as an optimisation problem and approximated efficiently using greedy heuristics. We show that, for many problems, as few as five explanations can faithfully emulate the closed-box model and that our reduction procedure is competitive with other model aggregation methods.
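The greedy reduction described above can be sketched as a simple set-cover-style heuristic: given a matrix of fidelity losses for each local model on each instance, repeatedly pick the model that most reduces the average best-available loss. This is an illustrative assumption about the procedure, not the paper's actual implementation; the function name `greedy_proxy_set` and the `loss` matrix convention are hypothetical.

```python
import numpy as np

def greedy_proxy_set(loss, k=5):
    """Greedily select k local models to form a proxy set.

    loss[i, j] is the (hypothetical) fidelity loss of local model j
    when used to explain instance i. The heuristic minimises the mean
    per-instance loss under the best model selected so far.
    """
    n_instances, n_models = loss.shape
    selected = []
    # Best loss achieved so far for each instance (infinite before any pick).
    best = np.full(n_instances, np.inf)
    for _ in range(min(k, n_models)):
        # Average loss per instance if each candidate model were added.
        gains = np.minimum(best[:, None], loss).mean(axis=0)
        gains[selected] = np.inf  # do not re-pick an already selected model
        j = int(np.argmin(gains))
        selected.append(j)
        best = np.minimum(best, loss[:, j])
    return selected, best.mean()
```

With `k=5` this mirrors the abstract's observation that a handful of explanations can already emulate the closed-box model; the exact optimisation objective in ExplainReduce may differ.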