To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. The foundation model is pre-trained on a large-scale data set using a self-supervised task that learns how to reconstruct masked EEG segments. We then fine-tune the model on a Motor Imagery Classification task to validate its performance in a low-data regime (9 subjects). Our experiments demonstrate that applying a foundation model can significantly improve classification performance compared to a model trained from scratch, which provides evidence for the generalizability of the foundation model and its ability to address challenges of data scarcity and heterogeneity in EEG. The code is publicly available at github.com/wenhui0206/NeuroGPT.
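The masked-reconstruction pretraining objective described above can be sketched as follows. This is a minimal conceptual sketch, not the paper's actual pipeline: the channel count, segment length, mask ratio, and the stand-in "predictor" (a per-channel mean over unmasked segments) are all illustrative assumptions; in Neuro-GPT the prediction would come from the EEG encoder and GPT model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy EEG recording: 22 channels x 1000 time points (shapes are illustrative).
eeg = rng.standard_normal((22, 1000))

# Split the recording into fixed-length segments along the time axis.
n_segments, seg_len = 10, 100
segments = eeg.reshape(22, n_segments, seg_len)

# Mask a subset of segments; the model must reconstruct them from context.
mask = rng.random(n_segments) < 0.3
masked = segments.copy()
masked[:, mask, :] = 0.0  # zero out the masked segments

# Stand-in prediction: per-channel mean of the unmasked segments.
# (A real model would predict each masked segment from the masked input.)
pred = segments[:, ~mask, :].mean(axis=1, keepdims=True)
pred = np.broadcast_to(pred, (22, int(mask.sum()), seg_len))

# Self-supervised objective: mean-squared reconstruction error,
# computed only on the masked segments.
loss = float(np.mean((pred - segments[:, mask, :]) ** 2))
print(loss)
```

The key property is that the loss is computed only over the masked segments, so the model is forced to infer missing EEG content from surrounding context rather than copy its input.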