In this paper, we compare several methods of training Slovak speech recognition models based on the Transformer architecture. Specifically, we explore transfer learning from an existing Czech pre-trained Wav2Vec 2.0 model into Slovak. We demonstrate the benefits of the proposed approach on three Slovak datasets. Our Slovak models achieved the best results when their weights were initialized from the Czech model at the beginning of the pre-training phase. Our results show that the knowledge stored in the Czech pre-trained model can be successfully reused to solve tasks in Slovak, outperforming even much larger public multilingual models.