As Machine Learning (ML) models achieve unprecedented levels of performance, the XAI domain aims to make these models understandable by presenting end-users with intelligible explanations. Yet, some existing XAI approaches fail to meet expectations: several issues have been reported in the literature, generally pointing to either technical limitations or misinterpretations by users. In this paper, we argue that the resulting harms arise from a complex overlap of multiple failures in XAI, which existing ad hoc studies fail to capture. This work therefore advocates a holistic perspective, presenting a systematic investigation of the limitations of current XAI methods and their impact on the interpretation of explanations. By distinguishing between system-specific and user-specific failures, we propose a typological framework that helps reveal the nuanced complexities of explanation failures. Leveraging this typology, we also discuss research directions to help AI practitioners better understand the limitations of XAI systems and enhance the quality of ML explanations.