SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding

Decoding the orchestration of neural activity in electroencephalography (EEG) signals is a central challenge in bridging neuroscience with artificial intelligence. Foundation models have made strides in generalized EEG decoding, yet many existing frameworks rely primarily on separate temporal and spectral masking of raw signals during self-supervised pretraining. Such strategies tend to bias learning toward high-frequency oscillations, because low-frequency rhythmic patterns can be easily inferred from the unmasked signal. We introduce a foundation model that uses a novel Gaussian-smoothed masking scheme applied to short-time Fourier transform (STFT) maps. By jointly applying time, frequency, and time-frequency Gaussian masks, we make the reconstruction task substantially harder, forcing the model to learn intricate neural patterns across both high- and low-frequency domains. To recover signals under this aggressive masking strategy, we design SpecHi-Net, a U-shaped hierarchical architecture with multiple encoding and decoding stages. To accelerate large-scale pretraining, we partition the data into three subsets, each used to train an independent expert model. We then combine these models through SpecMoE, a mixture-of-experts framework guided by a learned spectral gating mechanism. SpecMoE achieves state-of-the-art performance across a diverse set of EEG decoding tasks, including sleep staging, emotion recognition, motor imagery classification, abnormal signal detection, and drug effect prediction. Importantly, the model demonstrates strong cross-species and cross-subject generalization, maintaining high accuracy on both human and murine EEG datasets.
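As a rough illustration of the masking idea described above, the following sketch builds a soft keep-mask from one frequency band, one time band, and one time-frequency blob, each with a Gaussian profile, and applies it to an STFT magnitude map. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (stft_magnitude, gaussian_bump, gaussian_stft_mask), the widths sigma_f and sigma_t, the use of a single mask of each kind, and the choice to combine the three masks by multiplication are all hypothetical.

```python
import numpy as np

def stft_magnitude(x, n_fft=64, hop=16):
    """Plain NumPy STFT magnitude; returns an array of shape (n_freq, n_time)."""
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    spec = np.fft.rfft(frames * np.hanning(n_fft), axis=-1)
    return np.abs(spec).T

def gaussian_bump(length, center, sigma):
    """1-D Gaussian profile that equals 1 at `center` and decays smoothly."""
    idx = np.arange(length)
    return np.exp(-0.5 * ((idx - center) / sigma) ** 2)

def gaussian_stft_mask(n_freq, n_time, rng, sigma_f=3.0, sigma_t=5.0):
    """Soft keep-mask in [0, 1]; values near 0 mark masked regions.

    Assumption: one frequency mask, one time mask, and one time-frequency
    mask are drawn with random centers and combined multiplicatively.
    """
    f0, f1 = rng.integers(n_freq, size=2)
    t0, t1 = rng.integers(n_time, size=2)
    # Frequency mask: suppresses a band of frequencies at every time step.
    freq_band = gaussian_bump(n_freq, f0, sigma_f)[:, None]
    # Time mask: suppresses a span of time steps at every frequency.
    time_band = gaussian_bump(n_time, t0, sigma_t)[None, :]
    # Time-frequency mask: suppresses a localized blob.
    tf_blob = (gaussian_bump(n_freq, f1, sigma_f)[:, None]
               * gaussian_bump(n_time, t1, sigma_t)[None, :])
    return (1.0 - freq_band) * (1.0 - time_band) * (1.0 - tf_blob)

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)           # stand-in for one EEG channel
spec = stft_magnitude(signal)                # (n_freq, n_time)
keep = gaussian_stft_mask(*spec.shape, rng)  # soft mask of the same shape
masked_spec = spec * keep                    # input to a reconstruction model
```

In a pretraining loop of this kind, a model would receive masked_spec and be trained to reconstruct spec. Because the mask edges are smooth rather than binary, there is no sharp boundary between kept and suppressed regions.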
