Speech recognition has made excellent progress in recent years. However, purely data-driven approaches still struggle with domain mismatch and long-tailed data. Since knowledge-driven approaches can help data-driven approaches compensate for these weaknesses, we introduce sememe-based semantic knowledge into speech recognition (SememeASR). A sememe, by its linguistic definition, is the minimum semantic unit of a language and can represent the implicit semantic information behind each word well. Our experiments show that introducing sememe information improves speech recognition performance. Further experiments show that sememe knowledge improves the model's recognition of long-tailed data and enhances its domain generalization ability.