Lifelong sequence generation (LSG), a problem in continual learning, aims to continually train a model on a sequence of generation tasks so that it learns newly emerging generation patterns without forgetting previously acquired knowledge. Existing LSG methods mainly focus on preserving old knowledge and pay little attention to knowledge transfer across tasks. In contrast, humans learn new tasks better by leveraging knowledge previously acquired from similar tasks. Inspired by this human learning paradigm, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and to select the most similar previous tasks to facilitate adaptation to new tasks. In addition, because the learning process can easily become biased towards the current task, which can cause more severe forgetting of previously learned knowledge, we propose dynamic gradient scaling to balance the learning of the current task and the replayed tasks. Through extensive experiments, we demonstrate that DMEA consistently outperforms existing methods in different LSG settings.