NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning

Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro-spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space-inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator's capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time-frequency-aware, open-ended clinical interpretation of electrophysiological data.
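
The abstract does not specify the form of the contrastive objective that aligns waveform and topographic-map representations. As a rough illustration only, the sketch below assumes a symmetric InfoNCE (CLIP-style) loss over a batch of paired embeddings. The function names, the temperature value, and the toy embeddings are hypothetical and not taken from the paper. Plain Python is used to keep the example dependency-free; a practical implementation would use a tensor library.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cross_entropy_on_diagonal(logits):
    """Mean of -log softmax(row_i)[i]: row i should select column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_info_nce(wave_emb, topo_emb, temperature=0.07):
    """Assumed loss: symmetric InfoNCE over paired (waveform, topomap) embeddings.

    wave_emb[i] and topo_emb[i] are treated as a positive pair; every other
    combination in the batch acts as a negative.
    """
    w = [normalize(v) for v in wave_emb]
    t = [normalize(v) for v in topo_emb]
    # Cosine-similarity matrix scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(wi, tj)) / temperature for tj in t]
              for wi in w]
    transposed = [list(col) for col in zip(*logits)]
    # Average the waveform-to-map and map-to-waveform directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(transposed))

# Toy data standing in for encoder outputs (hypothetical, not real EEG features).
rng = random.Random(0)
topo = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
wave = [[x + 0.01 * rng.gauss(0, 1) for x in row] for row in topo]

loss_paired = symmetric_info_nce(wave, topo)           # correct pairing
loss_mispaired = symmetric_info_nce(wave[::-1], topo)  # pairing reversed
```

Under this assumed objective, correctly paired embeddings yield a lower loss than deliberately mispaired ones, which is the property a contrastive alignment stage relies on.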
