SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding

Decoding the orchestration of neural activity in electroencephalography (EEG) signals is a central challenge in bridging neuroscience with artificial intelligence. Foundation models have made strides in generalized EEG decoding, yet many existing frameworks rely primarily on separate temporal and spectral masking of raw signals during self-supervised pretraining. Such strategies tend to bias learning toward high-frequency oscillations, because low-frequency rhythmic patterns can be easily inferred from the unmasked signal. We introduce a foundation model that uses a novel Gaussian-smoothed masking scheme applied to short-time Fourier transform (STFT) maps. By jointly applying time, frequency, and time-frequency Gaussian masks, we make the reconstruction task substantially more challenging, forcing the model to learn intricate neural patterns across both high- and low-frequency domains. To effectively recover signals under this aggressive masking strategy, we design SpecHi-Net, a U-shaped hierarchical architecture with multiple encoding and decoding stages. To accelerate large-scale pretraining, we partition the data into three subsets, each used to train an independent expert model. We then combine these models through SpecMoE, a mixture-of-experts framework guided by a learned spectral gating mechanism. SpecMoE achieves state-of-the-art performance across a diverse set of EEG decoding tasks, including sleep staging, emotion recognition, motor imagery classification, abnormal signal detection, and drug effect prediction. Importantly, the model demonstrates strong cross-species and cross-subject generalization, maintaining high accuracy on both human and murine EEG datasets.
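
The joint masking idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`gaussian_bump`, `joint_gaussian_mask`), the widths `sigma_f` and `sigma_t`, the use of a single mask of each type, and the multiplicative way the three masks are combined are all assumptions made for illustration, and a random non-negative array stands in for an STFT magnitude map.

```python
import numpy as np

def gaussian_bump(n, center, sigma):
    """1-D Gaussian profile of length n that peaks at 1.0 at index `center`."""
    idx = np.arange(n)
    return np.exp(-0.5 * ((idx - center) / sigma) ** 2)

def joint_gaussian_mask(n_freq, n_time, t0, f0, tf_center,
                        sigma_f=3.0, sigma_t=5.0):
    """Build a soft keep-mask in [0, 1] of shape (n_freq, n_time).

    Hypothetical combination of three Gaussian-smoothed masks:
      - a time mask centered at frame t0 (spans all frequencies),
      - a frequency mask centered at bin f0 (spans all frames),
      - a time-frequency blob centered at tf_center = (bin, frame).
    A value of 0 means fully masked, 1 means fully kept.
    """
    time_band = gaussian_bump(n_time, t0, sigma_t)[None, :]   # shape (1, T)
    freq_band = gaussian_bump(n_freq, f0, sigma_f)[:, None]   # shape (F, 1)
    fc, tc = tf_center
    blob = (gaussian_bump(n_freq, fc, sigma_f)[:, None]
            * gaussian_bump(n_time, tc, sigma_t)[None, :])    # shape (F, T)
    # Assumed combination rule: multiply the three keep-masks together.
    return (1.0 - time_band) * (1.0 - freq_band) * (1.0 - blob)

# Toy usage: random centers and a stand-in "spectrogram".
rng = np.random.default_rng(0)
n_freq, n_time = 32, 64
spec = np.abs(rng.standard_normal((n_freq, n_time)))  # placeholder STFT magnitudes
keep = joint_gaussian_mask(
    n_freq, n_time,
    t0=int(rng.integers(n_time)),
    f0=int(rng.integers(n_freq)),
    tf_center=(int(rng.integers(n_freq)), int(rng.integers(n_time))),
)
masked_spec = spec * keep  # input to a reconstruction model; target would be `spec`
```

Because each mask decays smoothly from its center rather than cutting off at a hard edge, energy near the masked region is attenuated instead of removed outright. In this sketch the frequency mask suppresses a band across the whole recording, so a low-frequency rhythm that falls inside it cannot simply be read off the neighboring unmasked frames.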
