Foundation models are emerging as a powerful paradigm for fMRI analysis, but current approaches face a dual bottleneck of data and training efficiency. Atlas-based methods aggregate voxel signals into fixed regions of interest, reducing data dimensionality but discarding fine-grained spatial detail and requiring extremely large cohorts to train effectively as general-purpose foundation models. Atlas-free methods, on the other hand, operate directly on voxel-level signals, preserving spatial fidelity, but are prohibitively memory- and compute-intensive, making large-scale pre-training infeasible. We introduce SLIM-Brain (Sample-efficient, Low-memory fMRI Foundation Model for the Human Brain), a new atlas-free foundation model that improves both data and training efficiency simultaneously. SLIM-Brain adopts a two-stage adaptive design: (i) a lightweight temporal extractor captures global context across full sequences and ranks data windows by saliency, and (ii) a 4D hierarchical encoder (Hiera-JEPA) learns fine-grained voxel-level representations only from the top-$k$ selected windows, while discarding roughly 70% of patches via masking. Extensive experiments across seven public benchmarks show that SLIM-Brain establishes new state-of-the-art performance on diverse tasks, while requiring only 4,000 pre-training sessions and approximately 30% of the GPU memory of traditional voxel-level methods.
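The two-stage design can be illustrated with a minimal sketch. This is not the authors' implementation: the saliency score (per-window variance), the window length, and the patch counts below are all placeholder assumptions standing in for the lightweight temporal extractor and the Hiera-JEPA patch masking described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_windows(series, window_len, k, saliency_fn):
    """Stage (i): score every temporal window, keep the top-k.

    `saliency_fn` is a stand-in for the lightweight temporal
    extractor; here plain per-window variance is used as a toy
    saliency proxy (an assumption, not the paper's criterion).
    """
    n = series.shape[0] - window_len + 1
    scores = np.array([saliency_fn(series[t:t + window_len]) for t in range(n)])
    top = np.argsort(scores)[::-1][:k]  # indices of the k most salient windows
    return [series[t:t + window_len] for t in top]

def visible_patches(patch_ids, mask_ratio=0.7):
    """Stage (ii): drop ~70% of patches so the heavy 4D encoder
    only processes the remaining ~30%, cutting memory and compute."""
    n_keep = int(len(patch_ids) * (1 - mask_ratio))
    keep = rng.permutation(len(patch_ids))[:n_keep]
    return [patch_ids[i] for i in sorted(keep)]

# Toy 1-D "fMRI" time series with 100 time points.
series = rng.standard_normal(100)
windows = select_windows(series, window_len=16, k=3, saliency_fn=np.var)
visible = visible_patches(list(range(64)))  # 64 hypothetical 4D patches per window
print(len(windows), len(visible))  # 3 windows kept, 19 patches visible
```

The key efficiency lever is that the expensive encoder never sees most of the data: only the $k$ most salient windows are selected, and within each window only the unmasked ~30% of patches are encoded.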