Convolutional Neural Networks (CNNs) are frequently and successfully used in medical prediction tasks. They are often combined with transfer learning, which improves performance when training data for the task are scarce. The resulting models are highly complex and typically provide no insight into their predictive mechanisms, motivating the field of "explainable" artificial intelligence (XAI). However, previous studies have rarely evaluated the "explanation performance" of XAI methods quantitatively against ground-truth data, and the influence of transfer learning on objective measures of explanation performance has not been investigated. Here, we propose a benchmark dataset that allows explanation performance to be quantified in a realistic magnetic resonance imaging (MRI) classification task. We employ this benchmark to study the influence of transfer learning on the quality of explanations. Experimental results show that popular XAI methods applied to the same underlying model differ vastly in performance, even when only correctly classified examples are considered. We further observe that explanation performance depends strongly on the task used for pre-training and on the number of pre-trained CNN layers. These results hold after correcting for a substantial correlation between explanation performance and classification performance.
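To make the evaluation idea concrete, the sketch below shows one simple way to score an explanation against ground truth: compute a gradient-based saliency map for a CNN prediction and measure how many of its most salient pixels fall inside the region known to carry the class signal. The tiny model, the synthetic data, and the top-k precision metric are illustrative assumptions, not the benchmark or metrics used in the paper.

```python
# Minimal sketch: quantify "explanation performance" of a saliency map
# against a ground-truth mask. Model, data, and metric are placeholders.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(8 * 4 * 4, 2)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = TinyCNN().eval()

# Synthetic stand-ins: one "MRI slice" and a binary ground-truth mask
# marking where the discriminative signal is located.
x = torch.randn(1, 1, 64, 64, requires_grad=True)
gt_mask = torch.zeros(64, 64, dtype=torch.bool)
gt_mask[20:40, 20:40] = True

# Vanilla-gradient saliency: backpropagate the predicted class logit
# to the input and take absolute gradient magnitudes.
logits = model(x)
logits[0, logits.argmax()].backward()
saliency = x.grad.abs().squeeze()

# Score: fraction of the k most salient pixels that lie inside the
# ground-truth region (k = mask size), a simple localization metric.
k = int(gt_mask.sum())
topk = torch.zeros_like(gt_mask)
topk.view(-1)[saliency.view(-1).topk(k).indices] = True
precision = ((topk & gt_mask).sum().float() / k).item()
print(f"explanation precision: {precision:.3f}")
```

With a ground-truth mask of this kind available for every image, the same score can be computed per XAI method and per pre-training configuration, which is the type of comparison the abstract describes.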