The interpretability of deep neural networks has attracted considerable interest in the medical and healthcare domain. This interest stems from concerns about the transparency, the legal and ethical implications, and the medical significance of the predictions that deep neural networks produce in clinical decision support systems. To address these concerns, we study four well-established interpretability methods: Local Interpretable Model-agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), Gradient-weighted Class Activation Mapping (Grad-CAM), and Layer-wise Relevance Propagation (LRP). Using transfer learning on a multi-label, multi-class chest radiography dataset, we interpret predictions for specific pathology classes. Our analysis covers both single-label and multi-label predictions and provides a comprehensive and unbiased assessment through quantitative and qualitative investigations, both compared against human expert annotation. Notably, Grad-CAM performs best in the quantitative evaluation, while the LIME heatmap segmentation visualization shows the highest level of medical significance. Our results highlight the strengths and limitations of these interpretability methods and suggest that a multimodal approach, incorporating sources of information beyond chest radiography images, could offer additional insights for improving interpretability in the medical domain.
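For readers unfamiliar with Grad-CAM, the following minimal sketch shows how such a heatmap is formed from the feature maps of one convolutional layer and the gradients of a class score with respect to them, and how a heatmap might then be scored against an expert annotation. The abstract does not state which quantitative metric is used; intersection over union (IoU) with a 0.5 threshold is an assumption made here purely for illustration, and the function names `grad_cam` and `iou` are hypothetical.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one convolutional layer.

    activations, gradients: arrays of shape (K, H, W), holding the K feature
    maps and the gradient of the class score with respect to each of them.
    """
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps, giving an (H, W) map.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only the evidence that supports the class (ReLU).
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] when the map is not all zeros.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def iou(heatmap, expert_mask, threshold=0.5):
    """Assumed metric: IoU between a thresholded heatmap and an expert mask."""
    pred = heatmap >= threshold
    truth = expert_mask.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, truth).sum() / union)
```

In practice the activations and gradients would be taken from the last convolutional layer of the fine-tuned network, and the heatmap would be upsampled to the image resolution before comparison with the annotation.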