Self-supervised learning has emerged as a highly effective approach in natural language processing and computer vision. It is also applicable to brain signals such as electroencephalography (EEG) data, given the abundance of unlabeled data available across a wide spectrum of real-world medical applications, ranging from seizure detection to wave analysis. Existing work on self-supervised learning for EEG modeling mainly focuses on pretraining on each individual dataset corresponding to a single downstream task. This approach cannot leverage the power of abundant data and may yield sub-optimal solutions that lack generalization. Moreover, these methods rely on end-to-end model learning, which is not easy for humans to understand. In this paper, we present a novel EEG foundation model, namely EEGFormer, pretrained on large-scale compound EEG data. The pretrained model can not only learn universal representations of EEG signals with adaptable performance on various downstream tasks but also provide interpretable outcomes of the useful patterns within the data. To validate the effectiveness of our model, we extensively evaluate it on various downstream tasks and assess its performance under different transfer settings. Furthermore, we demonstrate how the learned model exhibits transferable anomaly detection performance and provides valuable interpretability of the patterns acquired via self-supervised learning.