Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs). However, EEG-based language decoding is still in its nascent stages, facing several technical issues: 1) the absence of a hybrid strategy that effectively integrates cross-modality (between EEG and text) self-learning with intra-modality self-reconstruction of EEG features or textual sequences; and 2) the under-utilization of large language models (LLMs) to enhance EEG-based language decoding. To address these issues, we propose the Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text through a dedicated multi-stream encoder. We further develop E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations), a framework that leverages pre-trained modules alongside the EEG stream from CET-MAE and enables an LLM (specifically BART) to decode text from EEG sequences. Comprehensive experiments on the popular text-evoked EEG database ZuCo demonstrate the superiority of E2T-PTR, which outperforms the state-of-the-art in ROUGE-1 F1 and BLEU-4 scores by 8.34% and 32.21%, respectively. These results represent a significant advancement in the field and underscore the proposed framework's potential to enable more powerful and widespread BCI applications.