Convolutional Neural Networks have reached human-level performance in some medical tasks, but their clinical use has been hindered by their lack of interpretability. Two major interpretability strategies have been proposed to tackle this problem: post-hoc methods and intrinsic methods. Although several post-hoc methods exist for interpreting deep learning (DL) models, the explanations they provide vary significantly from method to method, and they are difficult to validate because no ground truth is available. To address this challenge, we adapted the intrinsically interpretable ProtoPNet to histopathology imaging and compared its attribution maps with the saliency maps produced by post-hoc methods. To evaluate the similarity between the saliency maps and the attribution maps, we adapted 10 saliency metrics from the saliency model literature, and we validated the proposed approach on PatchCamelyon, a breast cancer metastasis detection dataset with 327,680 patches of histopathological images of sentinel lymph node sections. Overall, SmoothGrad and Occlusion showed a statistically significantly larger overlap with ProtoPNet, while Deconvolution and LIME showed the least.
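The comparison between a post-hoc saliency map and a ProtoPNet attribution map can be illustrated with a minimal sketch. The abstract does not name the ten adapted metrics, so the two shown here, Pearson's correlation coefficient (CC) and histogram intersection (SIM), are assumptions chosen because they are common in the saliency model literature; the function names and the toy maps are hypothetical and are not taken from the paper.

```python
import numpy as np


def pearson_cc(map_a, map_b):
    """Pearson correlation between two maps of identical shape.

    Illustrative metric only; not confirmed as one of the paper's ten.
    """
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    # A constant map has zero variance; report no correlation in that case.
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def histogram_intersection(map_a, map_b):
    """SIM: sum of element-wise minima after normalising each map to sum to 1.

    Illustrative metric only; not confirmed as one of the paper's ten.
    """
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    # Shift to non-negative values so each map can be read as a distribution.
    a = a - a.min()
    b = b - b.min()
    if a.sum() == 0 or b.sum() == 0:
        return 0.0
    a = a / a.sum()
    b = b / b.sum()
    return float(np.minimum(a, b).sum())


# Hypothetical toy maps standing in for a saliency map and an attribution map.
saliency_map = np.array([[1.0, 0.0], [0.0, 0.0]])
attribution_map = np.array([[0.0, 0.0], [0.0, 1.0]])

cc_score = pearson_cc(saliency_map, attribution_map)
sim_score = histogram_intersection(saliency_map, attribution_map)
```

Both scores grow as the two maps highlight the same regions: SIM is bounded in [0, 1] and CC in [-1, 1]. Averaging such scores over many patches, per post-hoc method, is one plausible way to rank methods by their agreement with the intrinsic model, although the paper's exact procedure is not given in the abstract.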