Capturing audio signals with specific directivity patterns is essential in speech communication. This study presents a deep neural network (DNN)-based approach to directional filtering, alleviating the need for explicit signal models. More specifically, our proposed method uses a DNN to estimate a single-channel complex mask from the signals of a microphone array. This mask is then applied to a reference microphone to render a signal that exhibits a desired directivity pattern. We investigate the training dataset composition and its effect on the directivity realized by the DNN during inference. Using a relatively small DNN, the proposed method is found to approximate the desired directivity pattern closely. Additionally, it allows for the realization of higher-order directivity patterns using a small number of microphones, which is a difficult task for linear and parametric directional filtering.
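The masking step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It computes an oracle complex mask for a single time-frequency bin, that is, the mask a DNN would ideally learn to estimate from the microphone-array signals. Several things here are assumptions made for illustration only: the time-frequency processing, the pattern family `(alpha + (1 - alpha) * cos(theta)) ** order`, and the function names `directivity_gain`, `oracle_mask`, and `apply_mask`.

```python
import math


def directivity_gain(theta_rad, order=1, alpha=0.5):
    """Hypothetical target pattern; alpha=0.5, order=1 gives a cardioid."""
    return (alpha + (1.0 - alpha) * math.cos(theta_rad)) ** order


def oracle_mask(source_bins, doas_rad, order=1, eps=1e-12):
    """Oracle complex mask for one time-frequency bin.

    source_bins: complex value of each source as observed at the
                 reference microphone (assumed known, oracle setting).
    doas_rad:    direction of arrival of each source, in radians.
    """
    # What the reference microphone actually records in this bin.
    mixture = sum(source_bins)
    # What it should have recorded with the desired directivity pattern.
    target = sum(
        directivity_gain(theta, order) * s
        for s, theta in zip(source_bins, doas_rad)
    )
    if abs(mixture) < eps:
        return 0j  # avoid dividing by an (almost) empty bin
    return target / mixture


def apply_mask(mask, reference_bin):
    """Render the output bin by masking the reference microphone bin."""
    return mask * reference_bin


# Two sources: one in the look direction (0 rad), one from behind (pi rad).
sources = [1 + 1j, 0.5 - 0.25j]
doas = [0.0, math.pi]
reference_bin = sum(sources)

mask = oracle_mask(sources, doas, order=1)
output_bin = apply_mask(mask, reference_bin)
```

In this toy case the cardioid has a null toward the rear, so the masked output retains only the frontal source. Raising `order` narrows the pattern without adding microphones to the sketch, which mirrors the abstract's point that the mask, not the array geometry, carries the directivity. In the setting the abstract describes, the source components and directions are not available at inference time, and the DNN has to infer the mask from the multichannel input alone.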