Explainable Artificial Intelligence (XAI) aims to improve the transparency of autonomous decision-making through explanations. Recent literature has emphasised users' need for holistic "multi-shot" explanations and the ability to personalise their engagement with XAI systems. We refer to this user-centred interaction as an XAI Experience. Despite advances in creating XAI experiences, evaluating them in a user-centred manner has remained challenging. To address this, we introduce the XAI Experience Quality (XEQ) Scale (pronounced "Seek" Scale) for evaluating the user-centred quality of XAI experiences. XEQ quantifies the quality of experiences across four evaluation dimensions: learning, utility, fulfilment and engagement. These contributions extend the state of the art in XAI evaluation, moving beyond the one-dimensional metrics frequently developed to assess single-shot explanations. In this paper, we present the XEQ Scale development and validation process, including content validation with XAI experts as well as discriminant and construct validation through a large-scale pilot study. Our pilot study results offer strong evidence that establishes the XEQ Scale as a comprehensive framework for evaluating user-centred XAI experiences.