Speech neuroprosthesis systems decode intended speech from neural activity in the absence of audible output, offering a path to restoring communication for individuals with speech-impairing conditions. Current approaches decode predominantly from motor cortical areas and discard others, such as area 44 (part of Broca's area), that may encode complementary linguistic information. We introduce MoDAl (Modality Decorrelation and Alignment), a framework that discovers complementary neural modalities through the interplay of two objectives in a shared projection space. A contrastive loss aligns each of several parallel brain encoders with the text embeddings of a pretrained large language model (LLM), while a decorrelation loss prevents the encoders from coalescing into duplicative representations. We prove that these objectives are in productive tension: Contrastive alignment induces transitive modality coalescence, which decorrelation must counteract for the framework to discover diverse neurolinguistic modalities. On the Brain-to-Text Benchmark '24, MoDAl achieves a word error rate (WER) of 21.6%, down from the 26.3% of the previous best end-to-end method, with the gain from incorporating previously discarded area 44 signals arising entirely from the decorrelation mechanism. Analysis of the discovered modalities reveals functional specialization: Encoders receiving area 44 input capture structural and syntactic properties (sentence length, grammatical voice, wh-words), consistent with the neurolinguistic understanding of Broca's area.
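The interplay of the two objectives can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss formulations, so everything here is an assumption made for illustration: an InfoNCE-style contrastive term stands in for the alignment loss, a mean squared cross-correlation penalty stands in for the decorrelation loss, and the function names, the `temperature` value, and the `weight` parameter are all hypothetical rather than taken from MoDAl.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(brain, text, temperature=0.1):
    """InfoNCE-style alignment (assumed form): row i of `brain` should match row i of `text`."""
    logits = l2_normalize(brain) @ l2_normalize(text).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


def decorrelation(z_a, z_b, eps=1e-8):
    """Mean squared cross-correlation between two encoders' features (assumed form)."""
    a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    corr = a.T @ b / z_a.shape[0]  # (dim, dim) cross-correlation matrix
    return np.mean(corr ** 2)


def combined_loss(encoder_outputs, text, weight=1.0):
    """Sum of per-encoder alignment terms plus weighted pairwise decorrelation."""
    n = len(encoder_outputs)
    align = sum(info_nce(z, text) for z in encoder_outputs)
    decor = sum(
        decorrelation(encoder_outputs[i], encoder_outputs[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return align + weight * decor
```

In this sketch, two encoders that both align perfectly with the same text embeddings also become correlated with each other, which is the coalescence the abstract describes; the pairwise penalty is the term that pushes against it.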