Interest is growing in the problem of training models of EEG features to predict brain activity measured using fMRI, i.e. the problem of EEG-to-fMRI synthesis. Despite some reported success, the statistical significance and generalizability of EEG-to-fMRI predictions remain to be fully demonstrated. Here, we investigate the predictive power of EEG for both task-evoked and spontaneous activity of the somatomotor network measured by fMRI, based on data collected from healthy subjects in two different sessions. We trained subject-specific distributed-lag linear models of time-varying, multi-channel EEG spectral power using Sparse Group LASSO regularization, and we showed that the learned models outperformed conventional EEG somatomotor rhythm predictors as well as mass-univariate correlation models. Furthermore, we showed that the learned models were statistically significantly better than appropriate null models in most subjects and conditions, although less frequently for spontaneous than for task-evoked activity. Critically, predictions improved significantly when training and testing on data acquired in the same session relative to across sessions, highlighting the importance of temporally separating the collection of training and test data to avoid data leakage and optimistic bias in estimates of model generalization. In sum, while we demonstrate that EEG models can provide fMRI predictions with statistical significance, we also show that predictive power is impaired for spontaneous fluctuations in brain activity and for models trained on data acquired in a different session. Our findings highlight the need to explicitly consider these often overlooked issues in the growing literature on EEG-to-fMRI synthesis.
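To make the modelling approach concrete, the following is a minimal sketch of a distributed-lag linear model fitted with a Sparse Group LASSO penalty. It is an illustration under stated assumptions, not the authors' implementation: the function names (`lagged_design`, `sgl_prox`, `fit_sgl`), the grouping of all lags of one EEG feature into one group, the penalty weighting (`lam`, `alpha`), and the proximal-gradient solver are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def lagged_design(power, lags):
    """Build a distributed-lag design matrix from EEG spectral power.

    power: (T, F) array, one column per feature (e.g. channel x band).
    lags:  non-negative sample delays applied to every feature.
    Returns X of shape (T, F * len(lags)) and a group label per column.
    Grouping all lags of one feature together is an assumption.
    """
    T, F = power.shape
    lags = list(lags)
    X = np.zeros((T, F * len(lags)))
    for f in range(F):
        for j, lag in enumerate(lags):
            # Column holds feature f delayed by `lag` samples (zero-padded start).
            X[lag:, f * len(lags) + j] = power[:T - lag, f]
    groups = np.repeat(np.arange(F), len(lags))
    return X, groups

def sgl_prox(w, groups, t_l1, t_group):
    """Proximal operator of the Sparse Group LASSO penalty:
    element-wise soft-thresholding followed by group-wise shrinkage."""
    w = np.sign(w) * np.maximum(np.abs(w) - t_l1, 0.0)
    out = np.zeros_like(w)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(w[idx])
        thr = t_group * np.sqrt(idx.sum())  # scale by sqrt(group size)
        if norm > thr:
            out[idx] = w[idx] * (1.0 - thr / norm)
    return out

def fit_sgl(X, y, groups, lam=0.1, alpha=0.5, n_iter=500):
    """Minimise (1/2n)||y - Xw||^2 + lam*alpha*||w||_1
    + lam*(1-alpha)*sum_g sqrt(p_g)*||w_g||_2 by proximal gradient descent."""
    n = X.shape[0]
    # Step size 1/L, where L is the Lipschitz constant of the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = sgl_prox(w - step * grad, groups,
                     step * lam * alpha, step * lam * (1.0 - alpha))
    return w
```

In this sketch, `y` would be the fMRI time series of the network of interest, resampled to the same time base as the EEG power features, and the lags would need to span the hemodynamic delay. To respect the session effect reported above, the weights would be fitted on one session and evaluated on the other, rather than on held-out segments of the same recording.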