Several recent contributions to iterative short-time Fourier transform (STFT) phase retrieval have demonstrated that the performance of the classical Griffin-Lim method can be improved considerably. By using the same projection operators as Griffin-Lim, but combining them in different ways, these approaches achieve better reconstruction quality in fewer iterations, while retaining a similar computational complexity per iteration. However, like Griffin-Lim, these algorithms operate offline and thus require an entire spectrogram as input, which is an unrealistic requirement for many real-world speech communication applications. We propose to extend real-time iterative spectrogram inversion (RTISI), an existing online (frame-by-frame) variant of the Griffin-Lim algorithm, into a flexible framework that enables straightforward online implementation of any algorithm based on iterative projections. We then use this framework to implement online variants of the fast Griffin-Lim algorithm, the accelerated Griffin-Lim algorithm, and two algorithms from the optics domain. Evaluation results on speech signals show that, as in the offline case, these algorithms can achieve a considerable performance gain over RTISI.
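To make the phrase "the same projection operators" concrete, the following is a minimal offline sketch in Python, not the online framework proposed here. It assumes a plain NumPy STFT with a periodic Hann window and a least-squares inverse; the function names (`stft`, `istft`, `project_consistent`, `project_magnitude`, `griffin_lim`, `inconsistency`) and all parameter values are illustrative choices, not taken from the paper. With `alpha=0` the loop is the classical Griffin-Lim alternation of the two projections; a nonzero `alpha` adds the extrapolation step usually associated with the fast Griffin-Lim algorithm.

```python
import numpy as np


def stft(x, win, hop):
    """Windowed frames of x -> one-sided spectra, shape (frames, bins)."""
    n = len(win)
    starts = range(0, len(x) - n + 1, hop)
    frames = np.stack([x[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1)


def istft(c, win, hop, length):
    """Least-squares inverse: windowed overlap-add over summed squared window."""
    n = len(win)
    frames = np.fft.irfft(c, n=n, axis=1) * win
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * hop:i * hop + n] += frame
        norm[i * hop:i * hop + n] += win ** 2
    # Samples that no window reaches stay zero instead of dividing by zero.
    return x / np.where(norm > 1e-12, norm, 1.0)


def project_consistent(c, win, hop, length):
    """Projection onto the set of spectrograms that are the STFT of some signal."""
    return stft(istft(c, win, hop, length), win, hop)


def project_magnitude(c, mag):
    """Projection onto the set of spectrograms with the target magnitude."""
    return mag * np.exp(1j * np.angle(c))


def griffin_lim(mag, win, hop, length, n_iter, alpha=0.0, seed=0):
    """alpha = 0: classical Griffin-Lim; alpha > 0: fast Griffin-Lim style update."""
    rng = np.random.default_rng(seed)
    c = mag * np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    t_prev = None
    for _ in range(n_iter):
        t = project_consistent(project_magnitude(c, mag), win, hop, length)
        # Extrapolate along the direction of the last change (skipped on step 1).
        c = t if t_prev is None else t + alpha * (t - t_prev)
        t_prev = t
    return istft(project_magnitude(c, mag), win, hop, length)


def inconsistency(x, mag, win, hop):
    """Relative distance between the STFT magnitude of x and the target."""
    return np.linalg.norm(np.abs(stft(x, win, hop)) - mag) / np.linalg.norm(mag)
```

A possible way to try it: build `mag = np.abs(stft(x, win, hop))` from a known signal `x`, with `win = np.hanning(257)[:-1]` and `hop = 64`, then compare `inconsistency(griffin_lim(mag, win, hop, len(x), 50), mag, win, hop)` against the same call with `alpha=0.99`, a value commonly used for the fast variant. Both calls need the whole of `mag` up front, which is the offline limitation the abstract refers to.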