Independent low-rank matrix analysis (ILRMA) has demonstrated great potential for determined blind source separation (BSS) of audio and speech signals. The method assumes that the spectra in different frequency bands are independent and that the spectral coefficients in each frequency band are Gaussian distributed; the Itakura-Saito divergence is then used to estimate the source-model parameters. In practice, however, spectral coefficients in different frequency bands may be dependent, which the existing ILRMA algorithm does not take into account. This paper presents an improved version of ILRMA that models the dependency between spectral coefficients in different frequency bands and uses the Sinkhorn divergence to optimize the source-model parameters. Exploiting this cross-band information improves BSS performance, but it also significantly increases the number of parameters to be estimated and, with it, the computational complexity. To reduce the complexity, we use the Kronecker product to decompose the modeling matrix into a product of several matrices of much smaller dimensionality. An efficient implementation of the Sinkhorn-divergence-based BSS algorithm is then developed, which reduces the complexity by an order of magnitude.
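The complexity saving from a Kronecker factorization can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the abstract does not specify the factor sizes or how many factors are used, so the two factors `A` and `B`, their shapes, and the helper `kron_matvec` are all hypothetical choices made for illustration. The sketch only shows the general identity that a matrix of the form `A ⊗ B` can be applied to a vector without ever forming the large matrix, and that it needs far fewer parameters.

```python
import numpy as np


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without building the Kronecker product.

    Uses the identity (A kron B) vec(X) = vec(A X B^T), where vec is the
    row-major flattening that matches NumPy's default reshape order.
    """
    q, s = A.shape[1], B.shape[1]
    X = x.reshape(q, s)                # undo the row-major vectorization
    return (A @ X @ B.T).reshape(-1)   # apply both small factors, re-flatten


# Hypothetical factor sizes, chosen only for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
B = rng.standard_normal((16, 16))
x = rng.standard_normal(A.shape[1] * B.shape[1])

dense = np.kron(A, B) @ x        # reference: forms the full 128 x 128 matrix
fast = kron_matvec(A, B, x)      # factored: only touches the small matrices

full_params = (A.shape[0] * B.shape[0]) * (A.shape[1] * B.shape[1])
factored_params = A.size + B.size

print("results match:", np.allclose(dense, fast))
print("parameters, full vs. factored:", full_params, "vs.", factored_params)
```

In this toy setting the full matrix has 16384 entries while the two factors together have 320, which is the kind of reduction that makes a cross-band model tractable; the actual saving in the proposed method depends on its specific factorization.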