In the past five years, the use of generative and foundational AI systems has greatly improved the decoding of brain activity. Visual perception, in particular, can now be decoded from functional Magnetic Resonance Imaging (fMRI) with remarkable fidelity. This neuroimaging technique, however, suffers from a limited temporal resolution ($\approx$0.5 Hz), which fundamentally constrains its real-time usage. Here, we propose an alternative approach based on magnetoencephalography (MEG), a neuroimaging technique capable of measuring brain activity with high temporal resolution ($\approx$5,000 Hz). For this, we develop an MEG decoding model trained with both contrastive and regression objectives and consisting of three modules: i) pretrained embeddings obtained from the image, ii) an MEG module trained end-to-end and iii) a pretrained image generator. Our results are threefold. First, our MEG decoder shows a 7X improvement in image retrieval over classic linear decoders. Second, late brain responses to images are best decoded with DINOv2, a recent foundational image model. Third, image retrievals and generations both suggest that high-level visual features can be decoded from MEG signals, although the same approach applied to 7T fMRI recovers low-level features better. Overall, these results, while preliminary, provide an important step towards the real-time decoding of the visual processes continuously unfolding within the human brain.
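To make the phrase "trained with both contrastive and regression objectives" concrete, the following is a minimal sketch of how two such losses might be combined on a batch of predicted and target image embeddings. It is an illustration only, not the model described above: the function names, the `temperature` value, the convex `weight` between the two terms, and the InfoNCE-style form of the contrastive term are all assumptions introduced here.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def contrastive_loss(pred, target, temperature=0.1):
    """InfoNCE-style loss (hypothetical choice): each predicted embedding
    should be more similar to its own target than to the other targets
    in the batch."""
    p = [l2_normalize(v) for v in pred]
    t = [l2_normalize(v) for v in target]
    total = 0.0
    for i, p_i in enumerate(p):
        logits = [dot(p_i, t_j) / temperature for t_j in t]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_z - logits[i]
    return total / len(p)


def mse_loss(pred, target):
    """Mean squared error over all embedding coordinates."""
    n = sum(len(v) for v in pred)
    sq = sum((a - b) ** 2 for u, v in zip(pred, target) for a, b in zip(u, v))
    return sq / n


def combined_loss(pred, target, weight=0.5):
    """Convex combination of the two objectives; `weight` is an
    illustrative hyperparameter, not a value taken from the text."""
    return weight * contrastive_loss(pred, target) + (1 - weight) * mse_loss(pred, target)


# Toy batch: three orthogonal "image embeddings".
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = combined_loss(targets, targets)
shuffled = combined_loss([targets[1], targets[2], targets[0]], targets)
print(aligned < shuffled)
```

In this toy batch, predictions that match their own targets give a lower combined loss than the same predictions assigned to the wrong targets. Intuitively, the contrastive term rewards picking the right image among candidates (retrieval), while the regression term pulls the prediction towards the exact embedding values that a downstream generator would consume.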