Steered Response Power (SRP) is a widely used method for sound source localization with microphone arrays, and it localizes sources satisfactorily in many practical scenarios. However, its performance degrades in highly reverberant environments. Although Deep Neural Networks (DNNs) have previously been proposed to overcome this limitation, most are trained for a specific number of microphones with fixed spatial coordinates. This restricts their practical use in scenarios frequently encountered in wireless acoustic sensor networks, where each application has an ad-hoc microphone topology. We propose Neural-SRP, a DNN that combines the flexibility of SRP with the performance gains of DNNs. We train our network using simulated data and transfer learning, and evaluate our approach on recorded and simulated data. Results show that Neural-SRP significantly outperforms the baselines in localization performance.
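For readers unfamiliar with the classical baseline, the following is a minimal sketch of conventional SRP with PHAT weighting, not of Neural-SRP itself. It illustrates why SRP is geometry-agnostic: the microphone coordinates enter only through the steering delays, so any number of microphones at arbitrary positions can be used. The function name `srp_phat`, the 2-D candidate grid, the free-field synthetic signals, and the speed of sound of 343 m/s are all assumptions made for illustration.

```python
import numpy as np

def srp_phat(signals, mic_positions, grid, fs, c=343.0):
    """Conventional SRP-PHAT map.

    signals:       (M, T) array, one row per microphone
    mic_positions: (M, D) microphone coordinates in metres
    grid:          (G, D) candidate source positions in metres
    Returns a (G,) array of steered response power values.
    """
    n_mics, n_samples = signals.shape
    n_fft = 2 * n_samples  # zero-pad so the correlation is linear, not circular
    spectra = np.fft.rfft(signals, n=n_fft, axis=1)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # Propagation delay from every grid point to every microphone, shape (G, M)
    delays = np.linalg.norm(grid[:, None, :] - mic_positions[None, :, :], axis=2) / c
    power = np.zeros(len(grid))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = spectra[i] * np.conj(spectra[j])
            cross /= np.abs(cross) + 1e-12  # PHAT: keep phase, discard magnitude
            tdoa = delays[:, i] - delays[:, j]  # expected delay difference per grid point
            steering = np.exp(2j * np.pi * freqs[None, :] * tdoa[:, None])
            power += np.real(steering @ cross)  # sum over frequency bins
    return power

# Synthetic free-field check (no reverberation, no attenuation):
# a noise burst is delayed to each microphone of an ad-hoc layout.
rng = np.random.default_rng(0)
fs = 16000
mics = np.array([[0.3, 0.2], [3.7, 0.5], [3.1, 3.6], [0.6, 3.2]])  # metres
source = np.array([2.0, 1.5])
burst = np.zeros(4096)
burst[1024:3072] = rng.standard_normal(2048)
freqs = np.fft.rfftfreq(burst.size, d=1.0 / fs)
dists = np.linalg.norm(mics - source, axis=1)
spectrum = np.fft.rfft(burst)
signals = np.array([
    np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * d / 343.0), n=burst.size)
    for d in dists
])
axis = np.arange(0.0, 4.01, 0.5)
grid = np.array([[x, y] for x in axis for y in axis])
estimate = grid[np.argmax(srp_phat(signals, mics, grid, fs))]
print(estimate)
```

In this idealized setting the peak of the SRP map should coincide with the true source position. In a reverberant room, reflections add spurious peaks to the map, which is the limitation the abstract refers to.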