Deep learning models in medical imaging are susceptible to shortcut learning: they can rely on confounding metadata (e.g., scanner model) that is often encoded in image embeddings. The crucial question is whether a model actively uses this encoded information for its final prediction. We introduce Weight Space Correlation Analysis, an interpretable methodology that quantifies feature utilization by measuring the alignment between the classification heads of a primary clinical task and those of auxiliary metadata tasks. We first validate the method by showing that it detects artificially induced shortcut learning. We then apply it to an SA-SonoNet model trained for spontaneous preterm birth (sPTB) prediction. Our analysis shows that, although the embeddings contain substantial metadata, the sPTB classifier's weight vectors are highly correlated with clinically relevant factors (e.g., birth weight) but decoupled from clinically irrelevant acquisition factors (e.g., scanner). The methodology thus provides a tool for verifying model trustworthiness and demonstrates that, in the absence of induced bias, the clinical model selectively uses features related to the genuine clinical signal.
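To make the idea of head alignment concrete, the following is a minimal sketch, not the authors' implementation. It assumes each task head is a single linear layer over the same shared embedding, so that every head reduces to one weight vector of equal length, and it assumes Pearson correlation as the alignment measure; the abstract does not state the exact metric. The names `weight_correlation`, `correlation_report`, `w_sptb`, and the synthetic weights are hypothetical and serve only as illustration.

```python
import numpy as np


def weight_correlation(w_primary, w_aux):
    """Pearson correlation between two head weight vectors (assumed metric)."""
    a = np.asarray(w_primary, dtype=float).ravel()
    b = np.asarray(w_aux, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("heads must act on the same embedding dimension")
    # Centre both vectors, then take the cosine of the centred vectors.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # a constant weight vector carries no direction
    return float(a @ b / denom)


def correlation_report(w_primary, aux_heads):
    """Correlate the primary head with every auxiliary metadata head."""
    return {name: weight_correlation(w_primary, w) for name, w in aux_heads.items()}


# Synthetic stand-ins for trained head weights (hypothetical, illustration only).
rng = np.random.default_rng(0)
dim = 256
w_sptb = rng.normal(size=dim)
aux_heads = {
    # Built to share a direction with the primary head.
    "birth_weight": 0.8 * w_sptb + 0.2 * rng.normal(size=dim),
    # Drawn independently of the primary head.
    "scanner": rng.normal(size=dim),
}
report = correlation_report(w_sptb, aux_heads)
for name, value in report.items():
    print(f"{name}: {value:+.3f}")
```

Under these assumptions, a correlation near zero for an acquisition factor such as the scanner suggests the primary head does not read out the direction that encodes it, even when that factor is linearly decodable from the embedding. A large absolute value would flag a candidate shortcut for closer inspection.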