While the emerging research field of explainable artificial intelligence (XAI) claims to address the lack of explainability in high-performance machine learning models, in practice XAI targets developers rather than actual end-users. Unsurprisingly, end-users are often unwilling to use XAI-based decision support systems. Moreover, interdisciplinary research on end-users' behavior when using XAI explanations is limited, so it remains unknown how explanations affect cognitive load and, in turn, end-user performance. Therefore, we conducted an empirical study with 271 prospective physicians, measuring their cognitive load, task performance, and task time for distinct implementation-independent XAI explanation types in a COVID-19 use case. We found that these explanation types strongly influence end-users' cognitive load, task performance, and task time. Further, we contextualized a mental efficiency metric, which ranked local XAI explanation types best, to provide recommendations for future applications and implications for sociotechnical XAI research.