Beyond Accuracy: Robustness, Interpretability and Expressiveness of EEG Foundation Models

EEG foundation models (EEG-FMs) have been evaluated predominantly on clean, in-distribution accuracy, leaving their robustness, interpretability and representational quality largely unexamined. This study addresses these gaps by benchmarking six EEG-FMs against a baseline deep learning model across eight datasets. Beyond clean accuracy, we conduct three layers of analysis: (i) Robustness: we apply test-time perturbations including additive noise, random and region-based channel dropout, and region-specific noise injection. Our analyses show that no single model dominates all failure modes. The most noise-robust model is among the most fragile under channel dropout, and much of the dropout fragility disappears when channels are removed rather than zero-padded. (ii) Interpretability: we present the first application of Attention-Aware Layer-Wise Relevance Propagation (AttnLRP) to EEG-FMs and show that models broadly concentrate relevance on task-appropriate brain regions, consistent with known neurophysiology. However, attribution maps remain spatially stable under perturbation while predictions degrade, suggesting that the models attend to the correct brain regions but decode corrupted content. (iii) Expressiveness: with block-wise probing, we show that late blocks are repurposed during fine-tuning, while early blocks already hold task-related information. Furthermore, we demonstrate that the poor head-only performance previously attributed to low-quality pre-trained representations is largely explained by pooling, and that EEG-FMs possess sufficient representational capacity when their token-level embeddings are preserved. Together, these findings provide the first systematic assessment of robustness, interpretability and expressiveness for EEG-FMs and highlight critical considerations for their development.
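The test-time perturbations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the noise distribution, the noise levels, or the dropout rates, so the Gaussian noise, the SNR parameterization, the 25% dropout rate, and the function names (`add_gaussian_noise`, `drop_channels`, `random_channel_dropout`) are all assumptions made for illustration. The sketch contrasts the two dropout variants the abstract distinguishes: zero-padding a dropped channel versus removing it from the input.

```python
import numpy as np


def add_gaussian_noise(x, snr_db, rng):
    """Add white Gaussian noise to x (channels, samples) at a target SNR in dB.

    Assumption: the abstract only says "additive noise"; Gaussian noise scaled
    to a signal-to-noise ratio is one common choice, not necessarily the paper's.
    """
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)


def drop_channels(x, drop_idx, mode="zero"):
    """Drop the channels listed in drop_idx.

    mode="zero":   keep the channel slots but fill them with zeros.
    mode="remove": delete the channels, so the model sees fewer channels.
    """
    if mode == "zero":
        out = x.copy()
        out[drop_idx] = 0.0
        return out
    if mode == "remove":
        return np.delete(x, drop_idx, axis=0)
    raise ValueError(f"unknown mode: {mode!r}")


def random_channel_dropout(x, p, rng, mode="zero"):
    """Drop a random fraction p of channels (hypothetical helper)."""
    n_channels = x.shape[0]
    n_drop = int(round(p * n_channels))
    drop_idx = rng.choice(n_channels, size=n_drop, replace=False)
    return drop_channels(x, drop_idx, mode=mode)


rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 256))  # synthetic stand-in: 8 channels, 256 samples

noisy = add_gaussian_noise(eeg, snr_db=0.0, rng=rng)
zeroed = random_channel_dropout(eeg, 0.25, rng, mode="zero")
removed = random_channel_dropout(eeg, 0.25, rng, mode="remove")

print(noisy.shape, zeroed.shape, removed.shape)
```

With zero-padding the input keeps its original channel count and the model receives all-zero rows, whereas removal shortens the channel axis. The second variant only applies to models that accept a variable number of channels; region-based dropout would follow the same pattern with `drop_idx` chosen from a fixed electrode group rather than at random.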

