Audio denoising is a core task in signal processing that improves intelligibility and fidelity in applications such as the restoration of musical recordings. This paper presents a proof of concept (PoC) for adapting a state-of-the-art neural audio codec, the Descript Audio Codec (DAC), to music denoising. The approach aims to overcome the limitations of traditional architectures such as U-Nets, and the model is trained on a large-scale, custom-synthesized dataset built from diverse sources. Training is guided by a multi-objective loss function that combines time-domain, spectral, and signal-level fidelity metrics. The result is intended as a PoC for high-fidelity, generative audio restoration.
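To make the idea of a multi-objective loss concrete, the following is a minimal NumPy sketch of how a time-domain term, a spectral term, and a signal-level term could be combined into one scalar. The abstract does not specify the individual terms or their weights, so everything here is an assumption chosen for illustration: an L1 waveform distance, a log-magnitude STFT distance, negative SI-SDR as the signal-level term, and the hypothetical names `stft_mag`, `si_sdr`, `denoising_loss`, `w_time`, `w_spec`, and `w_sdr`.

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=128):
    """Magnitude STFT with a Hann window, using only frames that fit fully in x."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=-1))

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    target = target - target.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the target to separate signal from residual.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    residual = estimate - projection
    return 10 * np.log10((np.sum(projection ** 2) + eps) / (np.sum(residual ** 2) + eps))

def denoising_loss(estimate, target, w_time=1.0, w_spec=1.0, w_sdr=0.01):
    """Weighted sum of three terms; the weights are illustrative assumptions."""
    # Time-domain term: mean absolute waveform error.
    time_term = np.mean(np.abs(estimate - target))
    # Spectral term: mean absolute difference of log-compressed STFT magnitudes.
    spec_term = np.mean(np.abs(np.log1p(stft_mag(estimate)) - np.log1p(stft_mag(target))))
    # Signal-level term: negative SI-SDR, so that a lower loss means higher fidelity.
    sdr_term = -si_sdr(estimate, target)
    return w_time * time_term + w_spec * spec_term + w_sdr * sdr_term
```

Because the SI-SDR term enters with a negative sign, the total can be negative; only its relative value matters for comparing estimates. In an actual training setup the same structure would be written in a differentiable framework, with the weights tuned so that no single term dominates.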