Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it requires several rounds of pruning and re-training for each language. In this work, we propose an adaptive masking approach for efficiently pruning a multilingual ASR model in two scenarios, yielding either sparse monolingual models or a sparse multilingual model (named Dynamic ASR Pathways). Our approach dynamically adapts the sub-network, avoiding premature decisions about a fixed sub-network structure. We show that our approach outperforms existing pruning methods when targeting sparse monolingual models. Further, we illustrate that Dynamic ASR Pathways jointly discovers and trains better sub-networks (pathways) of a single multilingual model by adapting from different sub-network initializations, thereby reducing the need for language-specific pruning.
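To make the idea of a sub-network that is adapted during training rather than fixed up front more concrete, the following is a minimal sketch, not the procedure used in this work. It assumes magnitude-based pruning, a dense weight update in which masked-out weights still receive gradients, and a periodic re-computation of the mask. The function names and the parameters `sparsity`, `lr`, and `readapt_every` are hypothetical and chosen only for illustration.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    Assumption: magnitude is the pruning criterion; the abstract does not say.
    """
    n_keep = len(weights) - int(round(sparsity * len(weights)))
    # Sort indices by descending magnitude; ties are broken by index.
    order = sorted(range(len(weights)), key=lambda i: (-abs(weights[i]), i))
    keep = set(order[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Weights as seen by the forward pass: pruned entries are zeroed."""
    return [w * m for w, m in zip(weights, mask)]


def adaptive_masking_step(weights, grads, mask, step,
                          lr=0.1, sparsity=0.5, readapt_every=2):
    """One training step with an adaptive (non-fixed) mask.

    Assumption: the update is dense, so a weight that is currently
    masked out keeps changing and can re-enter the sub-network later.
    """
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    if step % readapt_every == 0:
        # Re-derive the sub-network from the current weights.
        mask = magnitude_mask(new_weights, sparsity)
    return new_weights, mask


# Toy usage: four weights, half of them pruned.
weights = [0.9, 0.1, -0.5, 0.05]
mask = magnitude_mask(weights, sparsity=0.5)
# A large gradient on the pruned weight at index 1 lets it grow back.
grads = [0.0, -10.0, 0.0, 0.0]
weights, mask = adaptive_masking_step(weights, grads, mask, step=2)
```

In this toy run the weight at index 1 starts outside the sub-network, grows under the dense update, and displaces the weight at index 2 when the mask is re-computed. A fixed mask would have kept the initial selection for the rest of training.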