Lifelong sequence generation (LSG), a problem in continual learning, aims to continually train a model on a sequence of generation tasks so that it learns constantly emerging generation patterns while avoiding forgetting previous knowledge. Existing LSG methods mainly focus on maintaining old knowledge and pay little attention to knowledge transfer across tasks. In contrast, humans can learn new tasks better by leveraging previously acquired knowledge from similar tasks. Inspired by this human learning paradigm, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and to select the most similar previous tasks to facilitate adaptation to new tasks. In addition, because the learning process can easily be biased towards the current task, which might cause more severe forgetting of previously learned knowledge, we propose dynamic gradient scaling to balance the learning of the current task and replayed tasks. Through extensive experiments, we demonstrate that DMEA consistently outperforms existing methods in different LSG settings.