Transfer learning has markedly improved computer vision. These advances also promise improvements in neuroimaging, where training sets are often small. However, directly applying models pretrained on natural images to radiologic images such as MRIs raises several difficulties. In particular, a mismatch in the input space (2D images vs. 3D MRIs) restricts the direct transfer of models, often forcing us to consider only a few MRI slices as input. To address this, we leverage the 2D-Slice-CNN architecture of Gupta et al. (2021), which embeds all the MRI slices with 2D encoders (neural networks that take 2D image input) and combines them via permutation-invariant layers. With the insight that a pretrained model can serve as the 2D encoder, we initialize the 2D encoder with ImageNet-pretrained weights. The resulting models outperform those initialized and trained from scratch on two neuroimaging tasks: brain age prediction on the UK Biobank dataset and Alzheimer's disease detection on the ADNI dataset. Further, we improve the modeling capabilities of 2D-Slice models by incorporating spatial information through position embeddings, which can improve performance in some cases.
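The slice-wise design described above can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration, not the architecture of Gupta et al. (2021): the hypothetical `encode_slice` is a toy linear projection standing in for an ImageNet-pretrained 2D CNN, mean pooling is used as one simple permutation-invariant combination, and the `tanh` applied after adding the hypothetical `pos_emb` is our own choice so that slice position can affect the pooled result.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode_slice(slice_2d, weights):
    """Toy stand-in for a 2D encoder: flatten one slice and project it linearly."""
    return slice_2d.reshape(-1) @ weights


def slice_model(volume, weights, pos_emb=None):
    """Embed every slice of a 3D volume, then mean-pool the slice embeddings.

    volume:  (n_slices, height, width) array, slices taken along axis 0
    weights: (height * width, dim) projection used by the toy encoder
    pos_emb: optional (n_slices, dim) array, one embedding per slice position
    """
    feats = np.stack([encode_slice(s, weights) for s in volume])  # (n_slices, dim)
    if pos_emb is not None:
        # Assumption: add the embedding of each slice's position before pooling.
        feats = feats + pos_emb
    # A per-slice nonlinearity; without it, mean pooling would average the
    # position embeddings into a constant and discard the ordering again.
    feats = np.tanh(feats)
    return feats.mean(axis=0)  # mean over slices is permutation-invariant


n_slices, height, width, dim = 8, 4, 4, 5
volume = rng.normal(size=(n_slices, height, width))
weights = rng.normal(size=(height * width, dim))
pos_emb = rng.normal(size=(n_slices, dim))
reversed_volume = volume[::-1]  # same slices, opposite order

# Without position embeddings, slice order does not change the output.
out_plain = slice_model(volume, weights)
out_plain_rev = slice_model(reversed_volume, weights)

# With position embeddings, the same slices in a different order give a different output.
out_pos = slice_model(volume, weights, pos_emb)
out_pos_rev = slice_model(reversed_volume, weights, pos_emb)

print(np.allclose(out_plain, out_plain_rev))
print(np.allclose(out_pos, out_pos_rev))
```

In a real setting the toy projection would be replaced by a pretrained 2D network applied to each slice; the sketch only shows why plain pooling ignores slice order and how position embeddings can reintroduce it.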