Decoding language from neural signals is of considerable theoretical and practical importance. Previous research has shown that text or speech can be decoded from invasive neural signals. Decoding from non-invasive neural signals, however, remains challenging because of their low signal quality. In this study, we propose a data-driven approach for decoding the semantics of language from magnetoencephalography (MEG) signals recorded while subjects listened to continuous speech. First, a multi-subject decoding model was trained with contrastive learning to reconstruct continuous word embeddings from MEG data. Subsequently, a beam search algorithm was used to generate text sequences from the reconstructed word embeddings. For each candidate sentence in the beam, a language model predicted the possible subsequent words. The embeddings of these candidate words were then correlated with the reconstructed word embedding, and the resulting correlations served as a measure of the probability of each candidate being the next word. The results showed that the proposed continuous word embedding model can effectively leverage both subject-specific and subject-shared information. In addition, the decoded text was significantly similar to the target text, with an average BERTScore of 0.816, comparable to the score reported in a previous fMRI study.
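The beam search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pearson`, `beam_search_decode`), the toy embedding table, and the bigram table standing in for the language model are all hypothetical, and summing raw Pearson correlations as the beam score is an assumption, since the abstract only states that correlations were used as a measure of next-word probability.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def pearson(a: Vector, b: Vector) -> float:
    """Pearson correlation between two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def beam_search_decode(
    decoded: List[Vector],
    propose_next: Callable[[List[str]], List[str]],
    embed: Callable[[str], Vector],
    beam_width: int = 3,
) -> List[str]:
    """Pick a word sequence whose embeddings best match the decoded ones.

    decoded      : one reconstructed embedding per word position
    propose_next : stand-in for the language model; maps a prefix to
                   candidate next words (hypothetical interface)
    embed        : maps a word to its embedding (hypothetical interface)
    """
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    for target in decoded:
        candidates: List[Tuple[List[str], float]] = []
        for words, score in beams:
            for word in propose_next(words):
                # Assumption: the beam score is the sum of correlations.
                step_score = pearson(embed(word), target)
                candidates.append((words + [word], score + step_score))
        if not candidates:
            break  # the language model proposed no continuation
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]


# Toy data, invented purely for illustration.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "the": [1.0, 0.0, 0.0, 0.2],
    "cat": [0.0, 1.0, 0.0, 0.1],
    "dog": [0.0, 0.9, 0.3, 0.0],
    "sat": [0.0, 0.0, 1.0, 0.4],
    "ran": [0.3, 0.0, 0.8, 0.0],
}
TOY_BIGRAMS: Dict[str, List[str]] = {
    "": ["the"],
    "the": ["cat", "dog"],
    "cat": ["sat", "ran"],
    "dog": ["sat", "ran"],
}


def toy_propose_next(prefix: List[str]) -> List[str]:
    """Bigram lookup standing in for a real language model."""
    last = prefix[-1] if prefix else ""
    return TOY_BIGRAMS.get(last, [])


# Noisy versions of the embeddings for "the", "cat", "sat".
toy_decoded = [
    [0.9, 0.1, 0.0, 0.3],
    [0.1, 0.8, 0.1, 0.1],
    [0.0, 0.1, 0.9, 0.5],
]

result = beam_search_decode(toy_decoded, toy_propose_next, TOY_EMBEDDINGS.__getitem__)
```

With this toy data, `result` is `["the", "cat", "sat"]`. In a realistic setting the candidate words would come from a pretrained language model and the target embeddings from the MEG decoding model; the sketch only shows how the correlation acts as the ranking signal within the beam.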