Explainable Artificial Intelligence (XAI) aims to improve the transparency of autonomous decision-making through explanations. Recent literature has emphasised users' need for holistic "multi-shot" explanations and the ability to personalise their engagement with XAI systems. We refer to this user-centred interaction as an XAI Experience. Despite advances in creating XAI experiences, evaluating them in a user-centred manner has remained challenging. To address this, we introduce the XAI Experience Quality (XEQ) Scale (pronounced "Seek" Scale) for evaluating the user-centred quality of XAI experiences. XEQ quantifies the quality of experiences across four evaluation dimensions: learning, utility, fulfilment and engagement. These contributions extend the state-of-the-art in XAI evaluation, moving beyond the one-dimensional metrics frequently developed to assess single-shot explanations. In this paper, we present the XEQ Scale development and validation process, including content validation with XAI experts as well as discriminant and construct validation through a large-scale pilot study. Our pilot study results offer strong evidence establishing the XEQ Scale as a comprehensive framework for evaluating user-centred XAI experiences.