In this work, we study continual learning (CL), where the goal is to learn a model on a sequence of tasks such that data from previous tasks is unavailable while training on the current task. CL is essentially a balancing act between learning the new task (i.e., plasticity) and maintaining performance on previously learned concepts (i.e., stability). To address this stability-plasticity trade-off, we propose weight-ensembling of the model parameters of the previous and current tasks. The resulting weight-ensembled model, which we call Continual Model Averaging (CoMA), attains high accuracy on the current task by leveraging plasticity, while not deviating too far from the previous weight configuration, thereby ensuring stability. We also propose an improved variant of CoMA, named Continual Fisher-weighted Model Averaging (CoFiMA), which selectively weighs each parameter in the weight ensemble using the Fisher information of the model's weights. Both variants are conceptually simple, easy to implement, and effective in attaining state-of-the-art performance on several standard CL benchmarks. Code is available at: https://github.com/IemProg/CoFiMA.
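A minimal NumPy sketch of the two averaging schemes described above. The mixing coefficient `alpha`, the diagonal Fisher approximation, and the per-parameter normalization are assumptions for illustration; the paper's exact formulation may differ:

```python
import numpy as np

def coma_average(theta_prev, theta_curr, alpha=0.5):
    """CoMA-style sketch: uniform weight ensembling as a convex
    combination of previous- and current-task parameters.
    alpha is an assumed mixing coefficient."""
    return (1.0 - alpha) * theta_prev + alpha * theta_curr

def cofima_average(theta_prev, theta_curr, fisher_prev, fisher_curr, eps=1e-8):
    """CoFiMA-style sketch: each parameter is weighted by its (assumed
    diagonal) Fisher information, so parameters that are important to a
    task dominate the ensemble for that coordinate."""
    num = fisher_prev * theta_prev + fisher_curr * theta_curr
    den = fisher_prev + fisher_curr + eps  # eps avoids division by zero
    return num / den
```

With `alpha=0.5`, `coma_average` reduces to a plain midpoint of the two weight vectors, while `cofima_average` shifts each coordinate toward whichever model has higher Fisher information there.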