The use of machine learning (ML) in critical domains such as medicine poses risks and requires regulation. One requirement is that decisions of ML systems in high-risk applications should be human-understandable. The field of "explainable artificial intelligence" (XAI) seemingly addresses this need. However, in its current form, XAI is unfit to provide quality control for ML; it itself needs scrutiny. Popular XAI methods cannot reliably answer important questions about ML models, their training data, or a given test input. We recapitulate results demonstrating that popular XAI methods systematically attribute importance to input features that are independent of the prediction target. This limits their utility for purposes such as model and data (in)validation, model improvement, and scientific discovery. We argue that the fundamental reason for this limitation is that current XAI methods do not address well-defined problems and are not evaluated against objective criteria of explanation correctness. Researchers should formally define the problems they intend to solve first and then design methods accordingly. This will lead to notions of explanation correctness that can be theoretically verified and objective metrics of explanation performance that can be assessed using ground-truth data.
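The claim that attribution methods flag features independent of the prediction target can be illustrated with the well-known suppressor-variable effect. The sketch below (an illustrative assumption, not code from the paper) builds a linear regression in which feature `x2` is statistically independent of the target `y`, yet the optimal model must assign it a nonzero weight to cancel noise; any importance measure based on model weights will therefore highlight `x2`.

```python
# Illustrative sketch (not from the paper): a "suppressor" feature x2 is
# statistically independent of the target y, yet the optimal linear model
# assigns it a nonzero weight, so weight-based "importance" flags it.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)   # signal
d = rng.normal(size=n)   # distractor noise
y = z                    # target depends only on z
x1 = z + d               # measured feature: signal contaminated by noise
x2 = d                   # suppressor: pure noise, independent of y

X = np.column_stack([x1, x2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares

# x2 carries no information about y on its own (correlation near zero),
# yet the model relies on it to subtract the noise in x1: w is close to [1, -1].
print(np.corrcoef(x2, y)[0, 1])
print(w)
```

Here `x2` would be deemed "important" by weight inspection or gradient-based saliency, even though it is uninformative about `y` in isolation, which is exactly the kind of systematic misattribution the abstract refers to.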