Auditory spatial attention detection (ASAD) determines the direction of a listener's attention to a speaker by analyzing the listener's electroencephalographic (EEG) signals. This study aimed to improve ASAD performance with short decision windows (i.e., <1 s), rather than the long decision windows used in previous studies. To this end, an end-to-end temporal attention network (TAnet) was introduced. TAnet employs a multi-head attention (MHA) mechanism, which captures the interactions among time steps in the collected EEG signals more effectively and assigns a corresponding weight to each EEG time step. Experiments demonstrated that, compared with a CNN-based method and recent ASAD methods, TAnet provided improved decoding performance on the KUL dataset, with decoding accuracies of 92.4% (decision window 0.1 s), 94.9% (0.25 s), 95.1% (0.3 s), 95.4% (0.4 s), and 95.5% (0.5 s). As a new ASAD model for short decision windows, TAnet can potentially facilitate the design of EEG-controlled intelligent hearing aids and sound recognition systems.
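To make the idea of weighting EEG time steps concrete, the following is a minimal NumPy sketch of generic multi-head self-attention applied to one short decision window. It is not the authors' TAnet implementation: the sampling rate (128 Hz), the channel count (64), the number of heads (4), the random projection matrices, and the helper names `segment_windows` and `multi_head_self_attention` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def segment_windows(eeg, fs, window_s):
    """Split EEG of shape (samples, channels) into non-overlapping
    decision windows of shape (n_windows, window_len, channels).
    Trailing samples that do not fill a whole window are dropped."""
    win = int(round(fs * window_s))
    n = eeg.shape[0] // win
    return eeg[: n * win].reshape(n, win, eeg.shape[1])


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Generic scaled dot-product multi-head self-attention.

    x: (T, d_model), one decision window with T time steps.
    Returns the attended output (T, d_model) and the attention
    weights (num_heads, T, T); each row of weights sums to 1."""
    T, d_model = x.shape
    d_head = d_model // num_heads

    def split(m):
        # (T, d_model) -> (num_heads, T, d_head)
        return m.reshape(T, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, T, T)
    weights = softmax(scores, axis=-1)
    heads = weights @ v  # (heads, T, d_head)
    concat = heads.transpose(1, 0, 2).reshape(T, d_model)
    return concat @ w_o, weights


# Hypothetical settings, not taken from the paper.
fs, n_channels, num_heads = 128, 64, 4
rng = np.random.default_rng(0)

# Two seconds of synthetic "EEG" standing in for real recordings.
eeg = rng.standard_normal((2 * fs, n_channels))
windows = segment_windows(eeg, fs, window_s=0.25)

# Random projections stand in for learned parameters.
w_q, w_k, w_v, w_o = (
    rng.standard_normal((n_channels, n_channels)) / np.sqrt(n_channels)
    for _ in range(4)
)
out, attn = multi_head_self_attention(windows[0], w_q, w_k, w_v, w_o, num_heads)
print(windows.shape, out.shape, attn.shape)
```

Under these assumptions, a 0.25 s window holds 32 time steps, and each head produces a 32 × 32 matrix whose rows describe how strongly one time step attends to every other. In a trained model the projection matrices would be learned, and the attended output would feed a classifier that predicts the attended direction.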