In this work, we study the problem of cross-subject motor imagery (MI) decoding from electroencephalography (EEG) data. Multi-subject EEG datasets exhibit several kinds of domain shift caused by inter-individual differences (e.g., brain anatomy, personality, and cognitive profile). These domain shifts make multi-subject training challenging and impede robust cross-subject generalization. Motivated by the importance of domain generalization techniques for tackling such issues, we propose a two-stage model ensemble architecture built with multiple feature extractors (first stage) and a shared classifier (second stage), which we train end-to-end with two novel loss terms. The first loss applies curriculum learning, forcing each feature extractor to specialize on a subset of the training subjects and promoting feature diversity. The second loss is an intra-ensemble distillation objective that allows collaborative exchange of knowledge between the models of the ensemble. We compare our method against several state-of-the-art techniques, conducting subject-independent experiments on two large MI datasets, namely PhysioNet and OpenBMI. Our algorithm outperforms all competing methods in both 5-fold cross-validation and leave-one-subject-out evaluation settings, while using substantially fewer trainable parameters. We demonstrate that our model ensembling approach, which combines the strengths of curriculum learning and collaborative training, leads to high learning capacity and robust performance. Our work addresses the issue of domain shifts in multi-subject EEG datasets, paving the way for calibration-free brain-computer interfaces. We make our code publicly available at: https://github.com/gzoumpourlis/Ensemble-MI
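The two-stage design described above can be sketched in a few lines. The following is a minimal illustrative NumPy mock-up, not the authors' implementation: the linear feature extractors, layer sizes, and the use of mean-KL toward the ensemble average as the intra-ensemble distillation term are all assumptions made for illustration; the actual feature extractors are EEG-specific networks and the curriculum loss is omitted here.

```python
# Hedged sketch of a two-stage ensemble: several feature extractors (stage 1)
# feed one shared classifier (stage 2), and an intra-ensemble distillation
# loss pulls each member's predictions toward the ensemble consensus.
# All dimensions and the linear extractors are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_EXTRACTORS, IN_DIM, FEAT_DIM, N_CLASSES = 3, 8, 4, 2

# Stage 1: one weight matrix per feature extractor (stand-in for EEG encoders).
extractors = [rng.standard_normal((IN_DIM, FEAT_DIM)) for _ in range(N_EXTRACTORS)]
# Stage 2: a single classifier shared by all ensemble members.
shared_clf = rng.standard_normal((FEAT_DIM, N_CLASSES))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(x):
    """Each member's prediction = its own extractor + the shared classifier."""
    member_probs = np.stack([softmax(x @ W @ shared_clf) for W in extractors])
    return member_probs, member_probs.mean(axis=0)  # (members, batch, classes), (batch, classes)

def intra_ensemble_distillation_loss(member_probs, ensemble_probs):
    """Mean KL(ensemble || member): each member learns from the consensus."""
    kl = (ensemble_probs * (np.log(ensemble_probs) - np.log(member_probs))).sum(axis=-1)
    return kl.mean()

x = rng.standard_normal((5, IN_DIM))  # batch of 5 dummy input feature vectors
member_probs, ens_probs = ensemble_predict(x)
loss = intra_ensemble_distillation_loss(member_probs, ens_probs)
```

In training, this distillation term would be added to the per-member classification and curriculum losses and minimized end-to-end; here it only demonstrates the forward pass and the direction of knowledge exchange (members toward the ensemble mean).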