Fine-tuning an Automatic Speech Recognition (ASR) model on new domains degrades its performance on the original domains, a phenomenon referred to as Catastrophic Forgetting (CF). Continual Learning (CL) attempts to train ASR models without suffering from CF. While offline CL is usually considered in ASR, online CL is a more realistic but also more challenging scenario in which the model, unlike in offline CL, does not know when a task boundary occurs. Rehearsal-based methods, which store previously seen utterances in a memory, are often considered for online CL, both in ASR and in other research domains. However, recent research has shown that weight averaging is an effective method for offline CL in ASR. Based on this result, we propose a rehearsal-free method applicable to online CL. Our method outperforms all baselines, including rehearsal-based methods, in two experiments. It is a next step towards general CL for ASR, which should enable CL in all scenarios with few, if any, constraints.
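The abstract names weight averaging but does not say how the averaging is performed. As a generic illustration only, and not the method proposed in the paper, the sketch below keeps a uniform running average of model parameters without storing any past utterances. The function name `update_running_average`, the `Params` alias, and the representation of parameters as plain lists of floats are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def update_running_average(avg: Params, new: Params, n_averaged: int) -> Params:
    """Fold one more parameter set into a uniform running average.

    `avg` is the mean of `n_averaged` earlier parameter sets; the result
    is the mean of those sets plus `new`. No training data is stored,
    only the averaged weights and a counter.
    """
    if n_averaged < 0:
        raise ValueError("n_averaged must be non-negative")
    out: Params = {}
    for name, avg_values in avg.items():
        new_values = new[name]
        if len(new_values) != len(avg_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Incremental mean: avg + (new - avg) / (n + 1)
        out[name] = [
            a + (w - a) / (n_averaged + 1)
            for a, w in zip(avg_values, new_values)
        ]
    return out
```

In use, the average would start as a copy of the initial model's parameters with `n_averaged = 1`, and the function would be called after each chosen update with the counter incremented by one. How often to average, and whether uniform weighting is appropriate, are design choices the abstract leaves open.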