Objective: EEG-based methods can predict speech intelligibility, but their accuracy and robustness lag behind behavioral tests, which typically show test-retest differences under 1 dB. We introduce a multi-decoder method to predict speech reception thresholds (SRTs) from EEG recordings, enabling objective assessment of populations unable to perform behavioral tests, such as people with disorders of consciousness, or during hearing aid fitting. Approach: The method aggregates data from hundreds of decoders, each trained on a different combination of speech features and EEG preprocessing setups, to quantify neural tracking (NT) of speech signals. We recorded 29 minutes of EEG from each of 39 participants (ages 18-24) while they listened to speech at six signal-to-noise ratios (SNRs) and to a story in quiet. NT values were combined into one high-dimensional feature vector per subject, and a support vector regression model was trained to predict SRTs from these vectors. Main Result: Predictions correlated significantly with behavioral SRTs (r = 0.647, p < 0.001; NRMSE = 0.19), with all differences between predicted and behavioral SRTs under 1 dB. SHAP analysis showed that the theta and delta bands and early lags had slightly greater influence on the predictions. Using pretrained subject-independent decoders reduced the required EEG data collection to 15 minutes (3 minutes of story, 12 minutes across the six SNR conditions) without loss of accuracy.
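To make the quantities in the abstract concrete, the following is a minimal sketch of how per-decoder NT values could be assembled into a feature vector, and how predictions could be scored with Pearson's r and NRMSE. It rests on assumptions not stated in the abstract: NT is taken to be the Pearson correlation between a decoder's reconstruction and the actual speech feature, and NRMSE is taken to be the RMSE divided by the range of the behavioral SRTs. The function names `pearson`, `nt_feature_vector`, and `nrmse` are hypothetical, and the regression step itself (support vector regression in the abstract) is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def nt_feature_vector(reconstructions, targets):
    """Build one subject's feature vector: one NT value per decoder/condition.

    Assumption: NT is the correlation between the decoder output and the
    speech feature it was trained to reconstruct.
    """
    if len(reconstructions) != len(targets):
        raise ValueError("need one target per reconstruction")
    return [pearson(rec, tgt) for rec, tgt in zip(reconstructions, targets)]


def nrmse(predicted, observed):
    """RMSE normalised by the range of the observed values.

    Assumption: the abstract does not say how NRMSE is normalised;
    range normalisation is one common choice.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two equal-length, non-empty sequences")
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return math.sqrt(mse) / span


# Toy illustration with made-up numbers (not data from the study).
features = nt_feature_vector(
    reconstructions=[[0.1, 0.4, 0.3, 0.9], [0.5, 0.2, 0.8, 0.1]],
    targets=[[0.0, 0.5, 0.2, 1.0], [0.4, 0.3, 0.9, 0.0]],
)
behavioral_srt = [-9.0, -8.2, -7.5, -8.8]  # hypothetical SRTs in dB SNR
predicted_srt = [-8.6, -8.4, -7.9, -8.5]   # hypothetical model output
print(len(features), pearson(predicted_srt, behavioral_srt),
      nrmse(predicted_srt, behavioral_srt))
```

In a full pipeline the feature vector would hold hundreds of NT values per subject and would be passed to a regression model. Subject-wise cross-validation would be one plausible way to obtain predictions for scoring, although the abstract does not specify the validation scheme.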