This study presents a novel approach to EEG-based seizure detection built on BENDR, a BERT-based model trained in two phases. First, the model is pre-trained on the Temple University Hospital EEG Corpus (TUEG), a 1.5 TB dataset comprising over 10,000 subjects, to learn patterns common across EEG data. It is then fine-tuned on the CHB-MIT Scalp EEG Database, which consists of 664 EEG recordings from 24 pediatric patients, 198 of which contain seizure events. Our key contributions concern optimizing fine-tuning on the CHB-MIT dataset: we examine how model architecture, pre-processing, and post-processing techniques affect sensitivity and false positives per hour (FP/h), and we explore custom training strategies to identify the most effective setup. In particular, the model undergoes a novel second pre-training phase before subject-specific fine-tuning, which improves its generalization. The optimized model achieves as low as 0.23 FP/h, 2.5$\times$ lower than the baseline model, at the cost of a lower but still acceptable sensitivity, demonstrating the effectiveness of applying a BERT-based approach to EEG-based seizure detection.
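The two metrics reported above, sensitivity and FP/h, can be illustrated with a minimal sketch. It assumes event-based scoring, in which a seizure counts as detected if any predicted interval overlaps it and a prediction counts as a false positive if it overlaps no annotated seizure. The abstract does not state the scoring convention actually used, so this rule, the function names, and the example intervals are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (start_s, end_s) in seconds from recording start.
Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two intervals share any stretch of time."""
    return a[0] < b[1] and b[0] < a[1]


def sensitivity_and_fp_per_hour(
    true_seizures: List[Interval],
    detections: List[Interval],
    total_hours: float,
) -> Tuple[float, float]:
    """Event-based sensitivity and false positives per hour (assumed convention)."""
    # A seizure is detected if at least one prediction overlaps it.
    detected = sum(
        any(overlaps(s, d) for d in detections) for s in true_seizures
    )
    # A prediction is a false positive if it overlaps no annotated seizure.
    false_positives = sum(
        not any(overlaps(d, s) for s in true_seizures) for d in detections
    )
    sensitivity = detected / len(true_seizures) if true_seizures else float("nan")
    return sensitivity, false_positives / total_hours


# Illustrative values only, not taken from the study.
seizures = [(100.0, 160.0), (4000.0, 4050.0)]
predicted = [(110.0, 150.0), (2000.0, 2010.0), (7000.0, 7020.0)]
sens, fp_h = sensitivity_and_fp_per_hour(seizures, predicted, total_hours=4.0)
```

In this toy case one of two seizures is detected and two predictions match no seizure over four hours of recording, which shows how post-processing that suppresses isolated detections can lower FP/h while also lowering sensitivity.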