The audio inpainting problem in the time domain consists of estimating missing segments of samples within a signal, and a number of methods have been developed for it over the years. A time-frequency variant has also appeared in the literature, where the challenge is to reconstruct missing spectrogram columns using the reliable information that remains. We propose a method to address this time-frequency audio inpainting problem. Our approach is based on a recently introduced phase-aware signal prior that exploits an estimate of the instantaneous frequency. An optimization problem is formulated and solved using the generalized Chambolle-Pock algorithm. The proposed method is evaluated both objectively and subjectively against other time-frequency inpainting methods, specifically a deep-prior neural network and the autoregression-based approach known as Janssen-TF. The proposed approach outperformed these methods in the objective evaluation as well as in the listening test. Moreover, it did so at a substantially lower computational cost than the alternative methods.
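To make the problem setting concrete, the following minimal sketch builds a spectrogram of a test tone and marks a run of consecutive columns as missing. It illustrates only the time-frequency degradation model, not the proposed reconstruction method. The naive `stft` helper, the Hann window, the window length, the hop size, the test signal, and the gap position are all assumptions chosen for illustration.

```python
import numpy as np

def stft(x, win_len=256, hop=64):
    """Naive STFT (illustrative): rows are frequency bins, columns are time frames."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def mask_columns(spec, first, count):
    """Zero `count` consecutive spectrogram columns starting at index `first`.

    Returns the degraded spectrogram and a boolean vector that is True
    for reliable columns and False for missing ones.
    """
    reliable = np.ones(spec.shape[1], dtype=bool)
    reliable[first : first + count] = False
    degraded = spec * reliable[np.newaxis, :]
    return degraded, reliable

# Hypothetical test signal: one second of a 440 Hz sine sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)

spec = stft(x)
degraded, reliable = mask_columns(spec, first=100, count=8)
```

A time-frequency inpainting method receives `degraded` together with `reliable` and must estimate the columns where `reliable` is False. Because neighbouring STFT frames overlap, a gap of a few columns affects a longer stretch of time-domain samples than the hop size alone would suggest.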