Explainable AI (XAI) is a rapidly evolving field that aims to make AI systems more transparent and trustworthy to humans. One of the unsolved challenges in XAI is estimating the performance of explanation methods for neural networks, which has resulted in numerous competing metrics with little to no indication of which one should be preferred. In this paper, to identify the most reliable evaluation method in a given explainability context, we propose MetaQuantus -- a simple yet powerful framework that meta-evaluates two complementary performance characteristics of an evaluation method: its resilience to noise and its reactivity to randomness. We demonstrate the effectiveness of our framework through a series of experiments targeting various open questions in XAI, such as the selection of explanation methods and the optimisation of a given metric's hyperparameters. We release our work under an open-source license to serve as a development tool for XAI researchers and Machine Learning (ML) practitioners to verify and benchmark newly constructed metrics (i.e., ``estimators'' of explanation quality). With this work, we provide clear and theoretically grounded guidance for building reliable evaluation methods, thus facilitating standardisation and reproducibility in the field of XAI.