We propose an unsupervised approach for training separation models from scratch using RemixIT and Self-Remixing, which are recently proposed self-supervised learning methods for refining pre-trained models. They first separate mixtures with a teacher model and create pseudo-mixtures by shuffling and remixing the separated signals. A student model is then trained to separate the pseudo-mixtures using either the teacher's outputs or the initial mixtures as supervision. To refine the teacher's outputs, the teacher's weights are updated with the student's weights. While these methods originally assumed that the teacher is pre-trained, we show that they are capable of training models from scratch. We also introduce a simple remixing method to stabilize training. Experimental results demonstrate that the proposed approach outperforms mixture invariant training, which is currently the only available approach for training a monaural separation model from scratch.
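As a rough illustration of the remixing step and the teacher update described above, the following NumPy sketch shuffles teacher estimates across a batch to form pseudo-mixtures and then moves the teacher's weights toward the student's. The function names `remix` and `ema_update`, the array layout `(batch, n_src, time)`, the random stand-in for teacher outputs, and the exponential-moving-average rule with `decay=0.99` are all assumptions made for illustration. In particular, the text above only states that the teacher is updated with the student's weights, so the moving average is one possible choice and not necessarily the update used in the paper.

```python
import numpy as np


def remix(separated, rng):
    """Shuffle each source slot across the batch and sum to form pseudo-mixtures.

    separated: array of shape (batch, n_src, time) holding teacher estimates.
    Returns (pseudo_mixtures, shuffled_sources).
    """
    batch, n_src, _ = separated.shape
    shuffled = np.empty_like(separated)
    for s in range(n_src):
        # Independent permutation of the batch axis for every source slot.
        perm = rng.permutation(batch)
        shuffled[:, s] = separated[perm, s]
    # A pseudo-mixture is the sum of the shuffled sources in each batch entry.
    return shuffled.sum(axis=1), shuffled


def ema_update(teacher, student, decay=0.99):
    """Move teacher weights toward student weights (assumed update rule)."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


rng = np.random.default_rng(0)
# Stand-in for teacher outputs: 4 mixtures, 2 estimated sources, 16 samples.
separated = rng.standard_normal((4, 2, 16))
pseudo_mix, targets = remix(separated, rng)

# Hypothetical one-parameter "models", used only to show the weight update.
teacher_w = {"w": np.array([1.0])}
student_w = {"w": np.array([0.0])}
teacher_w = ema_update(teacher_w, student_w, decay=0.9)
```

In a training loop, the student would be trained to separate `pseudo_mix`, supervised either by `targets` (the shuffled teacher estimates) or by the initial mixtures, matching the two supervision options mentioned above.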