Lifelong sequence generation (LSG), a problem in continual learning, aims to continually train a model on a sequence of generation tasks so that it learns constantly emerging generation patterns without forgetting previous knowledge. Existing LSG methods mainly focus on maintaining old knowledge and pay little attention to knowledge transfer across tasks. In contrast, humans learn new tasks better by leveraging knowledge previously acquired from similar tasks. Inspired by this human learning paradigm, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and to select the most similar previous tasks to facilitate adaptation to new tasks. In addition, because the learning process can easily become biased towards the current task, which might cause more severe forgetting of previously learned knowledge, we propose dynamic gradient scaling to balance the learning of the current task and replayed tasks. Through extensive experiments, we demonstrate that DMEA consistently outperforms existing methods in different LSG settings.