Continual Learning (CL) methods aim to accumulate knowledge over time while avoiding catastrophic forgetting. Recently, Wortsman et al. (2020) proposed SupSup, a CL method that uses a randomly initialized, fixed base network (model) and finds a supermask for each new task; the supermask selectively keeps or removes each weight to produce a subnetwork. SupSup prevents forgetting because the network weights are never updated. However, its performance is sub-optimal because the fixed weights restrict its representational power. Furthermore, no knowledge is accumulated or transferred inside the model when new tasks are learned. Hence, we propose ExSSNeT (Exclusive Supermask SubNEtwork Training), which performs exclusive and non-overlapping subnetwork weight training. This avoids conflicting updates to the shared weights by subsequent tasks, improving performance while still preventing forgetting. We also propose a novel KNN-based Knowledge Transfer (KKT) module that utilizes previously acquired knowledge to learn new tasks better and faster. We demonstrate that ExSSNeT outperforms strong previous methods in both the NLP and Vision domains while preventing forgetting. Moreover, ExSSNeT is particularly advantageous for sparse masks that activate 2-10% of the model parameters, resulting in an average improvement of 8.3% over SupSup. Finally, ExSSNeT scales to a large number of tasks (100). Our code is available at https://github.com/prateeky2806/exessnet.
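The idea of exclusive, non-overlapping subnetwork weight training can be illustrated with a toy sketch. This is not the authors' implementation: the function `exclusive_update`, the flat list of weights, the hand-written masks, and the constant gradients are all hypothetical and chosen only for illustration. The sketch assumes that a weight selected by a task's supermask is trained only if no earlier task has already trained it, so earlier subnetworks stay unchanged.

```python
import random


def exclusive_update(weights, grads, task_mask, used_mask, lr=0.1):
    """One SGD-style step restricted to weights that the current task's mask
    selects AND that no earlier task has already trained (hypothetical helper)."""
    # A weight is trainable only if it is in the task mask and still unclaimed.
    trainable = [m and not u for m, u in zip(task_mask, used_mask)]
    new_weights = [
        w - lr * g if t else w
        for w, g, t in zip(weights, grads, trainable)
    ]
    return new_weights, trainable


random.seed(0)
# Randomly initialized base network, flattened to 8 weights for the toy example.
w0 = [random.gauss(0.0, 1.0) for _ in range(8)]
used = [False] * 8          # which weights have been trained by any task so far
grads = [1.0] * 8           # stand-in gradients (assumption, not real training)

# Task 1 selects weights 0-2 and trains all of them.
mask1 = [True, True, True, False, False, False, False, False]
w1, trained1 = exclusive_update(w0, grads, mask1, used)
used = [u or t for u, t in zip(used, trained1)]

# Task 2 selects weights 2-4; weight 2 overlaps with task 1 and stays frozen.
mask2 = [False, False, True, True, True, False, False, False]
w2, trained2 = exclusive_update(w1, grads, mask2, used)
used = [u or t for u, t in zip(used, trained2)]
```

In this sketch, task 2 still uses weight 2 in its forward pass through its mask, but it only updates weights 3 and 4, so the values seen by task 1's subnetwork are identical before and after task 2 is learned.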