Saliency maps can explain how deep neural networks classify images. But are they actually useful for humans? The present systematic review of 68 user studies found that while saliency maps can enhance human performance, null effects or even costs are quite common. To investigate what modulates these effects, the empirical outcomes were organised along several factors related to the human tasks, AI performance, explainable AI (XAI) methods, images to be classified, human participants and comparison conditions. In image-focused tasks, benefits were less common than in AI-focused tasks, but the effects depended on the specific cognitive requirements. Moreover, benefits were usually restricted to incorrect AI predictions in AI-focused tasks but to correct ones in image-focused tasks. XAI-related factors had surprisingly little impact. The evidence was limited for image- and human-related factors, and the effects were highly dependent on the comparison conditions. These findings may support the design of future user studies.