Decoding the attended speaker in a multi-speaker environment from electroencephalography (EEG) has attracted growing interest in recent years, with neuro-steered hearing devices as a driver application. Current approaches typically rely on ground-truth labels of the attended speaker during training, necessitating calibration sessions for each user and each EEG set-up to achieve optimal performance. While unsupervised self-adaptive auditory attention decoding (AAD) for stimulus reconstruction has been developed to eliminate the need for labeled data, it suffers from an initialization bias that can compromise performance. Although an unbiased variant has been proposed to address this limitation, it introduces substantial computational complexity that scales with data size. This paper presents three computationally efficient alternatives that achieve comparable performance, but with a significantly lower and constant computational cost. The code for the proposed algorithms is available at https://github.com/YYao-42/Unsupervised_AAD.
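To make the self-adaptive setting concrete, the following is a minimal illustrative sketch (not the paper's actual algorithm) of unsupervised stimulus-reconstruction AAD: a ridge-regression decoder maps EEG to a speech envelope, segments are labeled by correlating the reconstruction with each speaker's envelope, and training alternates with relabeling starting from random labels, which is the initialization the abstract refers to. All function names and parameters here are hypothetical.

```python
import numpy as np

def train_decoder(eeg, env, lam=1e-3):
    """Ridge-regression decoder mapping EEG (T, C) to a speech envelope (T,)."""
    R = eeg.T @ eeg + lam * np.eye(eeg.shape[1])
    return np.linalg.solve(R, eeg.T @ env)

def predict_labels(decoder, eeg_segs, env1_segs, env2_segs):
    """Label each segment by which speaker's envelope correlates best
    with the decoder's reconstruction (0 = speaker 1, 1 = speaker 2)."""
    labels = []
    for eeg, e1, e2 in zip(eeg_segs, env1_segs, env2_segs):
        rec = eeg @ decoder
        c1 = np.corrcoef(rec, e1)[0, 1]
        c2 = np.corrcoef(rec, e2)[0, 1]
        labels.append(0 if c1 >= c2 else 1)
    return labels

def unsupervised_aad(eeg_segs, env1_segs, env2_segs, n_iter=5, seed=0):
    """Self-adaptive loop: start from random labels, then alternate
    decoder training and relabeling until the labels stabilize."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, len(eeg_segs)).tolist()
    decoder = None
    for _ in range(n_iter):
        # Treat the currently predicted attended envelopes as pseudo-labels.
        attended = [e1 if l == 0 else e2
                    for l, e1, e2 in zip(labels, env1_segs, env2_segs)]
        X = np.vstack(eeg_segs)
        y = np.concatenate(attended)
        decoder = train_decoder(X, y)
        labels = predict_labels(decoder, eeg_segs, env1_segs, env2_segs)
    return decoder, labels
```

Because the random initial labels bias the first decoder, early relabeling errors can persist; this is the initialization bias that the unbiased variant (and the cheaper alternatives proposed in the paper) aim to avoid.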