Speaker localization for binaural microphone arrays has been widely studied for applications such as speech communication, video conferencing, and robot audition. Many methods developed for this task, including the direct-path dominance (DPD) test, share two common processing stages: a time-frequency transformation using the short-time Fourier transform (STFT), and a direction-of-arrival (DOA) search based on a set of head-related transfer functions (HRTFs). In this paper, alternatives to these processing stages, motivated by human hearing, are proposed. These include an auditory filter bank that replaces the STFT, and a new DOA search that uses transformed HRTFs as steering vectors. A simulation study and an experimental study are conducted to validate the proposed alternatives, with both alternatives applied to two binaural DOA estimation methods; the results show that the proposed alternatives compare favorably with current methods.