Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines a broad range of downstream tasks without requiring numerous costly annotations. However, brain network foundation models have received little investigation, which limits their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, comprising 70,781 samples from 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC), which generates millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module regularizes augmented brain networks of the same participant, which share similar topological properties, to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. In addition, BrainMass demonstrates powerful few/zero-shot learning abilities and yields meaningful interpretations for various diseases, showcasing its potential for clinical applications.
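The pFC augmentation described in the abstract (randomly dropping timepoints of the BOLD signal before computing connectivity) can be illustrated with a minimal sketch. The abstract does not specify the connectivity measure, the fraction of timepoints dropped, or any preprocessing, so the use of Pearson correlation, the `keep_ratio` parameter, and the function name `pseudo_fc` are assumptions made here for illustration only, not the authors' implementation.

```python
import numpy as np


def pseudo_fc(bold, keep_ratio=0.8, rng=None):
    """Build one augmented brain network from a BOLD time series.

    bold       : array of shape (n_timepoints, n_rois)
    keep_ratio : fraction of timepoints retained (assumed value, not from the paper)
    rng        : optional numpy random Generator for reproducibility
    """
    rng = np.random.default_rng() if rng is None else rng
    n_timepoints = bold.shape[0]
    # Keep at least two timepoints so that a correlation is defined.
    n_keep = max(2, int(round(keep_ratio * n_timepoints)))
    # Randomly choose which timepoints survive, preserving temporal order.
    kept = np.sort(rng.choice(n_timepoints, size=n_keep, replace=False))
    # Pearson correlation between ROI time series (assumed connectivity measure).
    return np.corrcoef(bold[kept], rowvar=False)


# Synthetic stand-in for one participant's scan: 200 timepoints, 10 ROIs.
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 10))

# Each call drops a different random subset, giving a different network.
networks = [pseudo_fc(bold, rng=rng) for _ in range(5)]
```

Because each call samples a different subset of timepoints, one scan can yield many distinct but closely related networks, which is how a dataset of tens of thousands of samples could plausibly be expanded into millions of augmented networks.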