The future of human-centric eXplainable Artificial Intelligence (XAI) is not post-hoc explanations

Explainable Artificial Intelligence (XAI), often defined as determining which features are most important to a model's prediction, plays a crucial role in enabling human understanding of and trust in deep learning systems. As models become larger, more ubiquitous, and more pervasive in daily life, explainability is necessary to avoid or minimize the adverse effects of model mistakes. Unfortunately, current approaches in human-centric XAI (e.g., predictive tasks in healthcare, education, or personalized ads) tend to rely on a single explainer. This trend is particularly concerning because recent work has identified systematic disagreement among explainability methods applied to the same points and the same underlying black-box models. In this paper, we therefore present a call for action to address the limitations of current state-of-the-art explainers. We propose shifting from post-hoc explainability to designing interpretable neural network architectures, moving away from approximation techniques in human-centric and high-impact applications. We identify five needs of human-centric XAI (real-time, accurate, actionable, human-interpretable, and consistent) and propose two schemes for interpretable-by-design neural network workflows (adaptive routing for interpretable conditional computation and diagnostic benchmarks for iterative model learning). We postulate that the future of human-centric XAI lies neither in explaining black boxes nor in reverting to traditional, interpretable models, but in neural networks that are intrinsically interpretable.
