A multi-task learning framework is proposed for optimizing a single deep neural network (DNN) for joint noise reduction (NR) and hearing loss compensation (HLC). A distinct training objective is defined for each task, and the DNN predicts two time-frequency masks. During inference, the amounts of NR and HLC can be adjusted independently by exponentiating each mask before combining them. In contrast to recent approaches that rely on training an auditory-model emulator to define a differentiable training objective, we propose an auditory model that is inherently differentiable, thus allowing end-to-end optimization. The audiogram is provided as an input to the DNN, thereby enabling listener-specific personalization without the need for retraining. Results show that the proposed approach not only allows adjusting the amounts of NR and HLC individually, but also improves objective metrics compared to optimizing a single training objective. It also outperforms a cascade of two DNNs that were separately trained for NR and HLC, and shows competitive HLC performance compared to a traditional hearing-aid prescription. To the best of our knowledge, this is the first study that uses an auditory model to train a single DNN for both NR and HLC across a wide range of listener profiles.
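As a rough illustration of the inference-time control described above, the following sketch exponentiates two time-frequency masks and then combines them. The abstract does not state how the masks are combined or what ranges they take, so several details here are assumptions: the masks are taken to be real-valued and non-negative, the combination is an element-wise product, and the names `combine_masks`, `nr_amount`, and `hlc_amount` are hypothetical.

```python
import numpy as np


def combine_masks(nr_mask, hlc_mask, nr_amount=1.0, hlc_amount=1.0):
    """Combine two time-frequency masks with independently adjustable strength.

    Assumptions (not confirmed by the abstract): both masks are real-valued,
    non-negative gains of the same shape, and they are combined by
    element-wise multiplication after exponentiation.
    """
    nr_mask = np.asarray(nr_mask, dtype=float)
    hlc_mask = np.asarray(hlc_mask, dtype=float)
    if np.any(nr_mask < 0) or np.any(hlc_mask < 0):
        raise ValueError("masks must be non-negative")
    if nr_amount < 0 or hlc_amount < 0:
        raise ValueError("amounts must be non-negative")

    # An exponent of 0 turns a mask into all ones (that task is disabled);
    # an exponent of 1 applies the mask exactly as predicted.
    scaled_nr = np.power(nr_mask, nr_amount)
    scaled_hlc = np.power(hlc_mask, hlc_amount)
    return scaled_nr * scaled_hlc


# Toy masks with shape (frequency bins, time frames) = (1, 2).
nr = np.array([[0.25, 1.0]])   # attenuates the first bin
hlc = np.array([[4.0, 2.0]])   # amplifies both bins

half_nr_full_hlc = combine_masks(nr, hlc, nr_amount=0.5, hlc_amount=1.0)
```

Under these assumptions, setting one exponent to 0 leaves only the other task's mask, which is one way to read the claim that the two amounts can be adjusted independently. The combined mask would then be applied to the noisy time-frequency representation before resynthesis.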