Low-resource speech recognition has long suffered from insufficient training data. In this paper, we propose an approach that leverages neighboring languages to improve performance in low-resource scenarios. It is founded on the hypothesis that similar linguistic units in neighboring languages exhibit comparable term frequency distributions, which enables us to construct a Huffman tree for multilingual hierarchical Softmax decoding. This hierarchical structure enables cross-lingual knowledge sharing among similar tokens, thereby improving low-resource training outcomes. Empirical analyses demonstrate that our method improves both the accuracy and the efficiency of low-resource speech recognition.
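To make the Huffman-tree idea concrete, the following is a minimal sketch of hierarchical Softmax over a Huffman tree built from pooled token frequencies. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the token inventory and counts in `pooled_freqs`, the function names `build_huffman_codes` and `token_probability`, the hidden dimension, and the randomly initialized node weights are all hypothetical. Each token's probability is the product of binary decisions along its root-to-leaf path, so frequent tokens need fewer decisions.

```python
import heapq
import itertools
import math
import random


def build_huffman_codes(freqs):
    """Map each token to (internal-node ids on its path, branch bits)."""
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(f, next(tie), ("leaf", tok)) for tok, f in freqs.items()]
    heapq.heapify(heap)
    next_id = 0
    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new internal node.
        f1, _, n1 = heapq.heappop(heap)
        f2, _, n2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), ("internal", next_id, n1, n2)))
        next_id += 1
    codes = {}

    def walk(node, path, bits):
        if node[0] == "leaf":
            codes[node[1]] = (path, bits)
            return
        _, idx, left, right = node
        walk(left, path + [idx], bits + [0])
        walk(right, path + [idx], bits + [1])

    walk(heap[0][2], [], [])
    return codes


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def token_probability(token, hidden, node_weights, codes):
    """Product of binary branch probabilities along the token's path."""
    path, bits = codes[token]
    prob = 1.0
    for idx, bit in zip(path, bits):
        score = sum(w * h for w, h in zip(node_weights[idx], hidden))
        p_right = sigmoid(score)
        prob *= p_right if bit == 1 else 1.0 - p_right
    return prob


# Hypothetical counts pooled over neighboring languages (illustrative only).
pooled_freqs = {"a": 50, "b": 20, "c": 15, "d": 10, "e": 5}
codes = build_huffman_codes(pooled_freqs)

rng = random.Random(0)
dim = 4  # assumed hidden size
# A tree with n leaves has n - 1 internal nodes, each with its own weights.
node_weights = {
    i: [rng.gauss(0, 1) for _ in range(dim)]
    for i in range(len(pooled_freqs) - 1)
}
hidden = [rng.gauss(0, 1) for _ in range(dim)]

total = sum(
    token_probability(t, hidden, node_weights, codes) for t in pooled_freqs
)
print(f"sum of token probabilities: {total:.6f}")
```

Because every internal node splits its probability mass between its two children, the leaf probabilities sum to one up to floating-point error, and scoring a token costs one binary decision per tree level rather than a normalization over the whole vocabulary.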