Transfer learning has become an essential part of medical image classification, often leveraging ImageNet weights. The domain shift from natural to medical images has prompted alternatives such as RadImageNet, which often demonstrate comparable classification performance. However, it remains unclear whether the performance gains from transfer learning stem from improved generalization or from shortcut learning. To address this, we investigate potential confounders, either synthetic or sampled from the data, across two publicly available chest X-ray and CT datasets. We show that models pretrained on ImageNet and RadImageNet achieve comparable classification performance, yet ImageNet-pretrained models are much more prone to overfitting to confounders. We recommend that researchers using ImageNet-pretrained models reexamine their model robustness by conducting similar experiments. Our code and experiments are available at https://github.com/DovileDo/source-matters.