Explainable AI (XAI) has been proposed as a valuable tool to assist in downstream tasks involving human-AI collaboration. Perhaps the most psychologically valid XAI techniques are case-based approaches, which display 'whole' exemplars to explain the predictions of black-box AI systems. However, for such post hoc XAI methods dealing with images, there has been no attempt to improve their scope by using multiple clear feature 'parts' of the images to explain the predictions while linking back to relevant cases in the training data, thus allowing for more comprehensive explanations that are faithful to the underlying model. Here, we address this gap by proposing two general algorithms (latent-based and superpixel-based) that isolate multiple clear feature parts in a test image and connect them to explanatory cases found in the training data. We then test their effectiveness in a carefully designed user study. Results demonstrate that the proposed approach appropriately calibrates a user's feelings of 'correctness' for ambiguous classifications in real-world data from the ImageNet dataset, an effect that does not occur when the explanation is shown without feature highlighting.