Explainability is a key requirement for computer-aided diagnosis systems in clinical decision-making. Multiple instance learning with attention pooling provides instance-level explainability; however, for many clinical applications a deeper, pixel-level explanation is desirable but has so far been missing. In this work, we investigate the use of four attribution methods to explain multiple instance learning models: GradCAM, Layer-Wise Relevance Propagation (LRP), Information Bottleneck Attribution (IBA), and InputIBA. With this collection of methods, we derive pixel-level explanations for the task of diagnosing blood cancer from patients' blood smears. We study two datasets of acute myeloid leukemia with over 100 000 single-cell images and observe how each attribution method performs on the multiple instance learning architecture, focusing on different properties of the single white blood cells. Additionally, we compare attribution maps with the annotations of a medical expert to see how the model's decision-making differs from the human standard. Our study addresses the challenge of implementing pixel-level explainability in multiple instance learning models and provides insights for clinicians to better understand and trust decisions from computer-aided diagnosis systems.
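To make the notion of instance-level explainability concrete, the following is a minimal sketch of attention pooling over a bag of single-cell feature vectors, using one common formulation in which each instance is scored by a small tanh network and the scores are normalized with a softmax. The abstract does not specify the architecture, so the function name `attention_pool`, the parameters `V` and `w`, and the toy values are assumptions made for illustration only. The resulting weights rank whole cells within a bag but say nothing about which pixels inside a cell matter, which is the gap the attribution methods are meant to fill.

```python
import math

def attention_pool(instances, V, w):
    """Attention pooling over a bag of instance feature vectors (illustrative sketch).

    instances: list of K feature vectors, each of length D (e.g. one per cell image)
    V: hypothetical hidden projection, a list of H rows of length D
    w: hypothetical scoring vector of length H
    Returns (bag_embedding, attention_weights).
    """
    # Score each instance: s_k = w . tanh(V h_k)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Numerically stable softmax over the bag gives one weight per instance
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag embedding is the attention-weighted sum of instance features
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Toy bag of three "cells" with 2-dimensional features (made-up numbers)
toy_instances = [[0.1, 0.0], [2.0, 2.0], [0.0, 0.2]]
toy_V = [[1.0, 0.0], [0.0, 1.0]]
toy_w = [1.0, 1.0]
bag_embedding, attention = attention_pool(toy_instances, toy_V, toy_w)
most_attended = max(range(len(attention)), key=lambda k: attention[k])
```

In this toy bag the second instance receives the largest weight, so an instance-level explanation would point to that cell as a whole; a pixel-level method would additionally have to attribute the score to regions within the cell image.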