The field of computer vision is undergoing a paradigm shift toward large-scale foundation model pre-training via self-supervised learning (SSL). Leveraging large volumes of unlabeled brain MRI data, such models can learn anatomical priors that improve few-shot performance on diverse neuroimaging tasks. However, most SSL frameworks are tailored to natural images, and their adaptation to multi-modal MRI remains underexplored. This work proposes a modality-invariant representation learning framework and, after large-scale pre-training, evaluates its effectiveness on stroke and epilepsy lesion segmentation. Experimental results suggest that despite successful cross-modality alignment, lesion segmentation primarily benefits from preserving fine-grained modality-specific features. Model checkpoints and code are made publicly available.
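To make the notion of cross-modality alignment concrete, the sketch below shows a symmetric InfoNCE-style contrastive loss that pulls together embeddings of the same subject acquired in two MRI modalities (e.g. T1w and FLAIR) while pushing apart embeddings of different subjects. This is a generic illustration of such an alignment objective, not the specific loss used in this work; the function name, temperature value, and NumPy implementation are all assumptions for exposition.

```python
import numpy as np

def cross_modal_infonce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss aligning paired embeddings from two modalities.

    z_a, z_b: (N, D) arrays of subject embeddings; row i of each array
    comes from the same subject, scanned in modality A and modality B.
    (Illustrative sketch, not the paper's actual objective.)
    """
    # L2-normalize so the dot product is cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # (N, N) cross-modal similarities

    def ce_diag(l):
        # Cross-entropy with the diagonal (matched pairs) as targets.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average over both retrieval directions (A→B and B→A).
    return 0.5 * (ce_diag(logits) + ce_diag(logits.T))
```

Minimizing this loss makes representations modality-invariant, which the experiments suggest can discard the fine-grained modality-specific cues that lesion segmentation relies on.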