Vertebral fracture grading, which classifies the severity of vertebral fractures, is a challenging task in medical imaging that has recently attracted Deep Learning (DL) approaches. Despite the need for transparency and trustworthiness in critical use cases such as DL-assisted medical diagnosis, only a few works have attempted to make such models human-interpretable. Moreover, these works rely either on post-hoc methods or on additional annotations. In this work, we propose ProtoVerse, a novel interpretable-by-design method that finds relevant sub-parts of vertebral fractures (prototypes) which reliably explain the model's decision in a human-understandable way. Specifically, we introduce a novel diversity-promoting loss to mitigate prototype repetition in small datasets with intricate semantics. We evaluated ProtoVerse on the VerSe'19 dataset, where it outperformed the existing prototype-based method. Further, our model provides superior interpretability compared with the post-hoc method. Importantly, expert radiologists validated the visual interpretability of our results, supporting its clinical applicability.