This paper proposes a simple but highly efficient expansion-based model for continual learning. Recent methods based on feature transformation, masking, and factorization are efficient, but they grow the model only over the global (shared) parameters. As a result, they do not fully utilize previously learned information, because the same task-specific parameters forget the earlier knowledge, and they therefore show limited transfer learning ability. Moreover, most of these models have constant parameter growth for every task, irrespective of task complexity. We propose a simple filter- and channel-expansion-based method that grows the model over the previous task parameters and not just over the global parameters. It therefore utilizes all previously learned information without forgetting, which results in better knowledge transfer. The growth rate in our model is a function of task complexity: for a simple task the model has smaller parameter growth, while for a complex task it requires more parameters to adapt to the current task. Recent expansion-based models show promising results for task incremental learning (TIL). However, for class incremental learning (CIL), predicting the task ID is a crucial challenge; hence, their results degrade rapidly as the number of tasks increases. In this work, we propose a robust task prediction method that leverages entropy-weighted data augmentations and the model's gradient using pseudo-labels. We evaluate our model on various datasets and architectures in the TIL, CIL, and generative continual learning settings. The proposed approach shows state-of-the-art results in all these settings, and our extensive ablation studies show the efficacy of the proposed components.
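To make the task-prediction idea in the abstract more concrete, the following is a minimal sketch of entropy-based task-ID selection over augmented views. It is an illustration under stated assumptions, not the paper's method: the function name `predict_task_id`, the use of normalized entropy, and the softmax weighting over negative entropy are all hypothetical choices, and the gradient-based component with pseudo-labels mentioned in the abstract is not modelled here. The sketch assumes each task-specific head has already produced logits for several augmentations of one test input, and that every head has at least two classes.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def entropy(probs, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of each probability vector."""
    return -(probs * np.log(probs + eps)).sum(axis=axis)


def predict_task_id(logits_per_task):
    """Pick the task whose head is most confident on the augmented views.

    logits_per_task: list with one array per task, each of shape
        (num_augmentations, num_classes_in_task).
    Returns (task_id, scores), where a lower score means a more confident head.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    each view's entropy is normalized by log(num_classes) so heads of
    different sizes are comparable, and views are weighted by a softmax
    over the negative entropy so confident views count more.
    """
    scores = []
    for logits in logits_per_task:
        probs = softmax(logits, axis=-1)
        h = entropy(probs) / np.log(probs.shape[-1])  # one value per view, roughly in [0, 1]
        w = softmax(-h, axis=0)                       # weights over views, sum to 1
        scores.append(float((w * h).sum()))
    return int(np.argmin(scores)), scores
```

For example, given one head that outputs all-zero logits (a uniform prediction) and another that outputs a sharply peaked prediction on every view, the function selects the second head, since its weighted entropy is lower.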