We propose a novel approach for the automatic equalization of individual musical instrument tracks. Our method first identifies the instrument present in a source recording and selects the corresponding ideal spectrum as a target. It then computes the spectral difference between the recording and the target, and an equalizer matching model uses this difference to predict settings for a parametric equalizer. To this end, we build upon a differentiable parametric equalizer matching neural network and demonstrate improvements over the previously established state of the art. Unlike past approaches, our system naturally allows real-world audio data to be leveraged when training the matching model: suitably produced training targets are generated automatically, mirroring the conditions at inference time. Fine-tuning the matching model on such examples considerably improves parametric equalizer matching performance in real-world scenarios, decreasing mean absolute error by 24% relative to methods that rely solely on random parameter sampling as a self-supervised learning strategy. Listening tests show that the proposed automatic equalization solution subjectively enhances the tonal characteristics of recordings of common instrument types.
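The spectral-difference step described above can be illustrated with a minimal sketch. The abstract does not specify how spectra are estimated, so everything below is an assumption made for illustration: the function names `average_spectrum_db` and `spectral_difference_db` are hypothetical, the spectrum is a long-term average over Hann-windowed frames, and the overall level offset is removed so that the difference describes tonal shape only. The instrument classifier and the equalizer matching network are not modeled here.

```python
import numpy as np


def average_spectrum_db(signal, n_fft=2048, hop=512, eps=1e-10):
    """Long-term average power spectrum in dB (assumed analysis, not the paper's)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        raise ValueError("signal must contain at least n_fft samples")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Average the power spectrum over time, then convert to decibels.
    power = np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)
    return 10.0 * np.log10(power + eps)


def spectral_difference_db(recording, target_db, **kwargs):
    """Per-bin gain in dB that would move the recording's spectrum toward the target."""
    diff = np.asarray(target_db, dtype=float) - average_spectrum_db(recording, **kwargs)
    # Assumption: subtract the mean so a pure loudness mismatch yields a flat curve.
    return diff - np.mean(diff)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    recording = rng.standard_normal(44100)  # stand-in for an instrument track
    target_db = average_spectrum_db(recording)
    target_db[len(target_db) // 2 :] += 3.0  # hypothetical target with brighter highs
    curve = spectral_difference_db(recording, target_db)
    print(curve.shape)
```

In a system like the one described, a curve of this kind would be the input from which the matching model predicts parametric equalizer settings; here it is simply returned as an array with one value per frequency bin.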