Recent approaches to source separation leverage semantic information about the input mixtures and their constituent sources; when used in conditional separation models, this information can yield impressive performance. Most approaches along these lines have focused on simple descriptions, which are not always useful across varying types of input mixtures. In this work, we present an approach in which a model, given an input mixture and partial semantic information about a target source, is trained to extract additional semantic data. We then leverage this pre-trained model to improve the separation performance of an uncoupled multi-conditional separation network. Our experiments demonstrate that the separation performance of this multi-conditional model is significantly improved, approaching that of an oracle model with complete semantic information. Furthermore, our approach achieves performance comparable to that of the best-performing specialized single-conditional models, thus providing an easier-to-use alternative.