Foundation models are transforming neuroscience but are often prohibitively large, data-hungry, and difficult to deploy. Here, we introduce BrainSymphony, a lightweight, parameter-efficient foundation model with plug-and-play integration of fMRI time series and diffusion-derived structural connectivity, allowing unimodal or multimodal training and deployment without architectural changes while requiring substantially less data than state-of-the-art models. The model processes fMRI time series through parallel spatial and temporal transformer streams, distilled into compact embeddings by a Perceiver module, while a novel signed graph transformer encodes anatomical connectivity from diffusion MRI. These complementary representations are then combined through an adaptive fusion mechanism. Despite its compact design, BrainSymphony consistently outperforms larger models on benchmarks spanning prediction, classification, and unsupervised network discovery. Highlighting the model's generalizability and interpretability, attention maps reveal drug-induced, context-dependent reorganization of cortical hierarchies in an independent psilocybin neuroimaging dataset. BrainSymphony delivers accessible, interpretable, and clinically meaningful results, demonstrating that architecturally informed multimodal models can surpass much larger counterparts and advance applications of AI in neuroscience.
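To make the three-stage pipeline concrete, the following is a minimal numpy sketch of the ideas named above: Perceiver-style distillation of per-region fMRI tokens into a compact functional embedding, sign-aware message passing over signed structural connectivity, and a gated adaptive fusion of the two. All dimensions, weight matrices, and the single-step attention are illustrative assumptions, not the paper's actual architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes (not from the paper): R regions, T timepoints, d model dim.
R, T, d = 8, 20, 16

# --- Functional stream: fMRI time series -> compact embedding ---
# One token per region, then a single cross-attention step with learned
# latent slots, standing in for Perceiver-style distillation.
X = rng.standard_normal((R, T))                   # fMRI time series per region
W_f = rng.standard_normal((T, d)) / np.sqrt(T)    # toy temporal projection
tokens = X @ W_f                                  # (R, d) region tokens
L = 4                                             # number of latent slots
latents = rng.standard_normal((L, d))
attn = softmax(latents @ tokens.T / np.sqrt(d))   # (L, R) cross-attention
func_emb = (attn @ tokens).mean(axis=0)           # (d,) functional embedding

# --- Structural stream: signed connectivity -> embedding ---
# A signed graph step keeps negative edges as repulsive messages
# instead of clipping them to zero.
A = rng.standard_normal((R, R)); A = (A + A.T) / 2   # signed structural matrix
pos, neg = np.maximum(A, 0), np.maximum(-A, 0)
node = rng.standard_normal((R, d))                   # toy node features
msg = softmax(pos, axis=1) @ node - softmax(neg, axis=1) @ node
struct_emb = msg.mean(axis=0)                        # (d,) structural embedding

# --- Adaptive fusion ---
# A sigmoid gate computed from both embeddings decides, per dimension,
# how much each modality contributes to the fused representation.
W_g = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)
gate = 1 / (1 + np.exp(-(np.concatenate([func_emb, struct_emb]) @ W_g)))
fused = gate * func_emb + (1 - gate) * struct_emb
print(fused.shape)  # -> (16,)
```

Because the gate lies in (0, 1) per dimension, dropping one modality (e.g. deploying unimodally) reduces to feeding a zero embedding for the missing stream, which is one way the plug-and-play property described above could be realized.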