Decoding human vision from neural signals has long attracted interest in neuroscience and machine learning. Modern contrastive learning and generative models have improved the performance of visual decoding and reconstruction based on functional magnetic resonance imaging (fMRI). However, the high cost and low temporal resolution of fMRI limit its application in brain-computer interfaces (BCIs), creating a strong need for visual reconstruction based on electroencephalography (EEG). In this study, we present an EEG-based visual reconstruction framework. It consists of a plug-and-play EEG encoder called the Adaptive Thinking Mapper (ATM), which is aligned with image embeddings, and a two-stage EEG-guided image generator that first transforms EEG features into image priors and then reconstructs the visual stimuli with a pre-trained image generator. Our approach allows EEG embeddings to achieve superior performance in image classification and retrieval tasks, and our two-stage image generation strategy vividly reconstructs the images seen by humans. Furthermore, we analyze the impact of signals from different time windows and brain regions on decoding and reconstruction. We also demonstrate the versatility of the framework on magnetoencephalography (MEG) data. We report that EEG-based visual decoding achieves state-of-the-art (SOTA) performance, highlighting the portability, low cost, and high temporal resolution of EEG, which enable a wide range of BCI applications. The code for ATM is available at https://github.com/dongyangli-del/EEG_Image_decode.
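The abstract states only that the EEG encoder is aligned with image embeddings and that the resulting EEG embeddings are used for retrieval; it does not specify the objective. As a minimal sketch of what such an alignment could look like, the NumPy example below assumes a CLIP-style symmetric contrastive (InfoNCE) loss and cosine-similarity retrieval. The function names, the temperature value of 0.07, and the embedding shapes are illustrative assumptions, not taken from the ATM code.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(eeg_emb, img_emb, temperature=0.07):
    """Assumed CLIP-style loss: row i of eeg_emb is paired with row i of img_emb.

    eeg_emb, img_emb: arrays of shape (batch, dim).
    """
    eeg = l2_normalize(eeg_emb)
    img = l2_normalize(img_emb)
    logits = eeg @ img.T / temperature  # (batch, batch) scaled cosine similarities
    idx = np.arange(logits.shape[0])
    # EEG -> image direction: softmax over the images for each EEG trial.
    loss_e2i = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Image -> EEG direction: softmax over the EEG trials for each image.
    loss_i2e = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_e2i + loss_i2e)


def retrieve_top_k(eeg_emb, img_emb, k=5):
    """Return, for each EEG embedding, the indices of the k most similar images."""
    sims = l2_normalize(eeg_emb) @ l2_normalize(img_emb).T
    return np.argsort(-sims, axis=1)[:, :k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.normal(size=(8, 16))                # stand-in image embeddings
    eeg = img + 0.01 * rng.normal(size=(8, 16))   # stand-in, nearly aligned EEG embeddings
    print("loss:", symmetric_contrastive_loss(eeg, img))
    print("top-1:", retrieve_top_k(eeg, img, k=1).ravel())
```

In this sketch, zero-shot classification would follow the same pattern as retrieval: the candidate set is one embedding per class, and the predicted class is the top-1 match.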