Convolutional Neural Networks (CNNs) are frequently and successfully used in medical prediction tasks. They are often combined with transfer learning, which improves performance when training data for the task are scarce. The resulting models are highly complex and typically do not provide any insight into their predictive mechanisms, motivating the field of 'explainable' artificial intelligence (XAI). However, previous studies have rarely quantitatively evaluated the 'explanation performance' of XAI methods against ground-truth data, and the influence of transfer learning on objective measures of explanation performance has not been investigated. Here, we propose a benchmark dataset that allows for quantifying explanation performance in a realistic magnetic resonance imaging (MRI) classification task. We employ this benchmark to understand the influence of transfer learning on the quality of explanations. Experimental results show that popular XAI methods applied to the same underlying model differ vastly in performance, even when considering only correctly classified examples. We further observe that explanation performance strongly depends on the task used for pre-training and on the number of pre-trained CNN layers. These results hold after correcting for a substantial correlation between explanation and classification performance.