Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs). However, EEG-based language decoding is still in its nascent stages and faces several technical issues, including: 1) the absence of a hybrid strategy that can effectively integrate cross-modality (between EEG and text) self-learning with intra-modality self-reconstruction of EEG features or textual sequences; and 2) the under-utilization of large language models (LLMs) to enhance EEG-based language decoding. To address these issues, we propose the Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text through a dedicated multi-stream encoder. Furthermore, we develop a framework called E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations), which leverages pre-trained modules alongside the EEG stream from CET-MAE and further enables an LLM (specifically BART) to decode text from EEG sequences. Comprehensive experiments conducted on the popular text-evoked EEG database, ZuCo, demonstrate the superiority of E2T-PTR, which outperforms the state-of-the-art in ROUGE-1 F1 and BLEU-4 scores by 8.34% and 32.21%, respectively. These results indicate significant advancements in the field and underscore the proposed framework's potential to enable more powerful and widespread BCI applications.
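To make the idea of compound self-supervised learning concrete, the sketch below combines a cross-modality contrastive term with two intra-modality masked-reconstruction terms. It is only an illustration under stated assumptions, not the CET-MAE implementation: the symmetric InfoNCE form, the mean-squared-error EEG reconstruction, the cross-entropy text reconstruction, the temperature, the loss weights, and every function name here are hypothetical choices that the abstract does not specify.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row vector to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def contrastive_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (EEG, text) embeddings.

    eeg_emb, text_emb: (B, D); row i of each is assumed to be a matched pair.
    The temperature value is an assumption, not taken from the paper.
    """
    logits = l2_normalize(eeg_emb) @ l2_normalize(text_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_e2t = -log_softmax(logits)[idx, idx].mean()    # EEG -> text
    loss_t2e = -log_softmax(logits.T)[idx, idx].mean()  # text -> EEG
    return 0.5 * (loss_e2t + loss_t2e)


def masked_mse(pred, target, mask):
    """MSE averaged over masked EEG positions only.

    pred, target: (B, T, F); mask: (B, T) bool, True where the input was masked.
    """
    per_position = ((pred - target) ** 2).mean(axis=-1)
    return (per_position * mask).sum() / max(mask.sum(), 1)


def masked_token_nll(logits, token_ids, mask):
    """Cross-entropy averaged over masked text positions only.

    logits: (B, L, V); token_ids: (B, L) int; mask: (B, L) bool.
    """
    logp = log_softmax(logits)
    picked = np.take_along_axis(logp, token_ids[..., None], axis=-1)[..., 0]
    return -(picked * mask).sum() / max(mask.sum(), 1)


def compound_loss(eeg_emb, text_emb,
                  eeg_pred, eeg_target, eeg_mask,
                  text_logits, text_ids, text_mask,
                  w_contrast=1.0, w_eeg=1.0, w_text=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_contrast * contrastive_loss(eeg_emb, text_emb)
            + w_eeg * masked_mse(eeg_pred, eeg_target, eeg_mask)
            + w_text * masked_token_nll(text_logits, text_ids, text_mask))
```

In this sketch the contrastive term pulls matched EEG and text embeddings together across modalities, while the two masked terms require each stream to reconstruct its own hidden inputs; how CET-MAE actually balances and schedules these objectives is not stated in the abstract.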