We address the problem of accurately interpolating measured anechoic steering vectors using a deep learning framework known as a neural field. This task plays a pivotal role in reducing the resource-intensive measurements required for precise sound source separation and localization, which are essential front-end components of speech recognition. Classical interpolation approaches rely on linear weighting of spatially nearby measurements over a fixed, discrete set of frequencies. Drawing inspiration from the success of neural fields for novel view synthesis in computer vision, we introduce the neural steerer, a continuous complex-valued function that takes both frequency and direction as input and produces the corresponding steering vector. Importantly, it incorporates inter-channel phase difference information and a regularization term enforcing filter causality, both of which are essential for accurate steering vector modeling. Our experiments, conducted on a dataset of real measured steering vectors, demonstrate the effectiveness of our resolution-free model in interpolating such measurements.
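To make the idea of a continuous, complex-valued field over frequency and direction concrete, the following is a minimal sketch and not the architecture used in this work. It assumes a single azimuth angle as the direction, a small randomly initialized (untrained) two-layer network, and a Fourier-feature input encoding. The names `encode`, `init_params`, and `steering_field`, the layer sizes, and the 8 kHz normalization constant are all hypothetical choices made for illustration. The inter-channel phase difference term and the causality regularizer are deliberately omitted, since their exact form is not given here.

```python
import numpy as np

def encode(freq_hz, azimuth_rad, n_bands=4, f_max=8000.0):
    """Encode a (frequency, direction) query as a real feature vector (assumed encoding)."""
    f = freq_hz / f_max  # normalise frequency to roughly [0, 1]
    # cos/sin of the angle make the features periodic in direction
    feats = [np.cos(azimuth_rad), np.sin(azimuth_rad)]
    for k in range(n_bands):
        feats += [np.sin(2.0**k * np.pi * f), np.cos(2.0**k * np.pi * f)]
    return np.array(feats)  # length 2 + 2 * n_bands

def init_params(n_in, n_hidden, n_mics, seed=0):
    """Randomly initialise a two-layer network (untrained, illustration only)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (2 * n_mics, n_hidden)),
        "b2": np.zeros(2 * n_mics),
    }

def steering_field(params, freq_hz, azimuth_rad):
    """Evaluate the field at any frequency and direction; returns one complex value per channel."""
    x = encode(freq_hz, azimuth_rad)
    h = np.tanh(params["W1"] @ x + params["b1"])
    out = params["W2"] @ h + params["b2"]
    n_mics = out.size // 2
    # First half of the output is the real part, second half the imaginary part
    return out[:n_mics] + 1j * out[n_mics:]

# Query at a frequency and angle that need not lie on any measurement grid
params = init_params(n_in=10, n_hidden=32, n_mics=4)
a = steering_field(params, freq_hz=1234.5, azimuth_rad=np.deg2rad(37.0))
print(a.shape, a.dtype)
```

Because the inputs are continuous, the same function can be evaluated between measured directions and between frequency bins, which is the sense in which such a model is resolution-free. In practice the parameters would be fitted to measured steering vectors rather than left at their random initial values.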