We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and adapting the pretrained models to specific domains limits their practical use in rescoring. Here we present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction (0.08%) of the pretrained parameters. The inserted low-rank matrices are optimized through a discriminative training objective along with a correlation-based regularization loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets, where it reduces training times by factors of 3.6 to 5.4.
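To make the low-rank decomposition concrete, the following is a minimal sketch of a single LoRA-adapted linear layer in plain Python. It is an illustration under stated assumptions, not the LoRB implementation: the class name `LoRALinear`, the `alpha / rank` scaling, the zero initialization of `B`, and the toy dimensions are all assumptions borrowed from the common LoRA formulation. The abstract does not specify which BERT weight matrices are adapted or at what rank, so the parameter counts here are not meant to reproduce the 0.08% figure.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pretrained weight, shape (d_out, d_in), kept frozen
        # A: small random init, shape (rank, d_in)
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B: zero init, shape (d_out, rank), so the update starts at zero
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path: W x
        delta = matvec(self.B, matvec(self.A, x))   # low-rank path: B (A x)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A and B would receive gradient updates.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


# Toy 4x4 "pretrained" weight adapted with rank 1.
W = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 3.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
layer = LoRALinear(W, rank=1)
x = [1.0, 2.0, 3.0, 4.0]
print(layer.forward(x))          # equals W x before any training, since B is zero
print(layer.trainable_params())  # rank * (d_in + d_out)
print(layer.frozen_params())     # d_in * d_out
```

In this formulation the trainable parameter count grows as `rank * (d_in + d_out)` while the frozen count grows as `d_in * d_out`, which is why the trainable fraction shrinks quickly for large layers and small ranks. Because `B` starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and a training objective (such as the discriminative loss mentioned above) would then update only `A` and `B`.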