Masked and Predictive Self-Supervised Foundation Models for 3D Brain MRI

Self-supervised foundation models have shown strong promise in medical imaging. However, existing MRI foundation-model studies have primarily emphasized segmentation and dense prediction tasks, while systematic investigation of self-supervised foundation models for MRI-based disease detection remains limited. In this work, we investigate two major self-supervised pretraining paradigms for MRI-based disease detection: reconstruction-based learning via Masked Autoencoders (MAE) and predictive representation learning via Joint Embedding Predictive Architectures (JEPA). We study the role of auxiliary objectives in two ways: we introduce a novel spectral-domain reconstruction loss for MAE to enhance sensitivity to fine-grained anatomical structure, and we integrate variance-covariance regularization (VCR) into our JEPA framework to encourage decorrelated latent representations. Our models are pretrained on heterogeneous single-contrast MRI volumes in a contrast-agnostic setting, without modality concatenation. Across five downstream disease detection tasks, our results highlight the importance of self-supervised objective design for medical foundation model pretraining: the downstream benefit of each objective is determined by its relevance to the task's structure. Specifically, spectral regularization yields the largest improvements when the downstream discriminative signal is characterized by strong high-frequency anatomical structures, while covariance regularization is most beneficial when discriminative information spans multiple decorrelated feature dimensions. MAE with spectral-domain supervision consistently achieves superior downstream performance for MRI-based disease detection. These findings suggest that self-supervised objectives in medical imaging encode specific biases and that their downstream benefit is fundamentally conditioned on the task's structure.
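
To make the two auxiliary objectives concrete, the following is a minimal NumPy sketch, not the formulation used in this work. The abstract does not specify either loss, so everything here is an assumption: the spectral term is taken to be a mean absolute difference between log-amplitude spectra of the 3D FFT, and the VCR term follows the commonly used VICReg-style hinge on per-dimension standard deviation plus a penalty on off-diagonal covariance. The function names and the parameters `gamma` and `eps` are hypothetical.

```python
import numpy as np


def spectral_reconstruction_loss(pred, target):
    """Assumed spectral loss: L1 distance between 3D FFT log-amplitude spectra.

    pred, target: real-valued volumes of shape (D, H, W).
    """
    pred_amp = np.abs(np.fft.fftn(pred, norm="ortho"))
    target_amp = np.abs(np.fft.fftn(target, norm="ortho"))
    # log1p compresses the dynamic range so that low frequencies do not dominate.
    return float(np.mean(np.abs(np.log1p(pred_amp) - np.log1p(target_amp))))


def variance_covariance_regularizer(z, gamma=1.0, eps=1e-4):
    """Assumed VICReg-style terms for a batch of embeddings z of shape (N, D).

    Returns (variance_term, covariance_term).
    """
    n, d = z.shape
    z = z - z.mean(axis=0, keepdims=True)
    # Variance term: hinge that penalizes dimensions whose std falls below gamma.
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    var_term = float(np.mean(np.maximum(0.0, gamma - std)))
    # Covariance term: squared off-diagonal covariance, which pushes
    # embedding dimensions toward being decorrelated.
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_term = float(np.sum(off_diag ** 2) / d)
    return var_term, cov_term
```

Under these assumptions, the spectral term would typically be added to the usual masked-voxel reconstruction loss with a weighting coefficient, and the two VCR terms would be added to the JEPA prediction loss in the same way. The weights are hyperparameters that the abstract does not report.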
