In today's critical domains, the predominance of black-box machine learning models amplifies the demand for Explainable AI (XAI). The widely used SHAP values quantify feature importance, but they are often too intricate and lack human-friendly explanations. Counterfactual (CF) explanations, in turn, present 'what ifs' but leave users grappling with the 'why'. To bridge this gap, we introduce XAIstories. Leveraging Large Language Models, XAIstories provide narratives that shed light on AI predictions: SHAPstories build on SHAP explanations to explain a prediction score, while CFstories build on CF explanations to explain a decision. Our results are striking: over 90% of the surveyed general audience finds the narratives generated by SHAPstories convincing. Data scientists primarily see the value of SHAPstories in communicating explanations to a general audience, with 92% indicating that it will contribute to the ease and confidence of nonspecialists in understanding AI predictions. Additionally, 83% of data scientists indicate they are likely to use SHAPstories for this purpose. In image classification, over 75% of lay user participants consider CFstories at least as convincing as their own crafted stories. CFstories also bring a tenfold speed gain in creating a narrative and improve accuracy by over 20% compared to manually created narratives. These results suggest that XAIstories may provide the missing link in truly explaining and understanding AI predictions.