In class-incremental learning (CIL), deep neural networks have to adapt their model parameters to non-stationary data distributions, e.g., the emergence of new classes over time. However, CIL models are challenged by the well-known catastrophic forgetting phenomenon. Typical methods, such as rehearsal-based ones, store exemplars of old classes to mitigate catastrophic forgetting, which limits real-world applications because of memory constraints and privacy concerns. In this paper, we propose a novel rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks. Our approach jointly optimizes a plastic CNN feature extractor and an analytical feed-forward classifier. The inaccessibility of historical data is tackled by holistically controlling the parameters of a well-trained model, ensuring that the learned decision boundary fits new classes while retaining recognition of previously learned classes. Specifically, the trainable CNN feature extractor provides task-dependent knowledge separately without interference, and the final classifier integrates task-specific knowledge incrementally for decision-making without forgetting. In each CIL session, the model accommodates new tasks by attaching a tiny set of declarative parameters to its backbone, in which only one matrix per task or one vector per class is kept for knowledge retention. Extensive experiments on a variety of task sequences show that our method achieves competitive results against state-of-the-art methods, particularly in accuracy gain, memory cost, training efficiency, and task-order robustness. Furthermore, to allow the non-growing backbone (i.e., a model with limited network capacity) to train on more incoming tasks, we empirically investigate a graceful forgetting implementation applied to previously learned trivial tasks.
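To make the idea of retaining "one vector per class" without stored exemplars concrete, the following is a minimal sketch and not the method proposed in the paper. It swaps in a plain nearest-class-mean classifier over fixed feature vectors; the class name `PrototypeClassifier`, its methods, and the toy two-dimensional features are all hypothetical, and the features are assumed to come from some feature extractor that is not shown here.

```python
import math


class PrototypeClassifier:
    """Illustrative nearest-class-mean classifier (an assumption, not the paper's
    analytical classifier). It keeps one running sum and one count per class,
    so no exemplars of old classes are stored."""

    def __init__(self):
        self.sums = {}    # class label -> running sum of feature vectors
        self.counts = {}  # class label -> number of samples seen

    def learn_session(self, features, labels):
        """Fold one session's (feature, label) pairs into the per-class statistics."""
        for x, y in zip(features, labels):
            if y not in self.sums:
                self.sums[y] = [0.0] * len(x)
                self.counts[y] = 0
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1

    def prototype(self, y):
        """Return the mean feature vector of class y."""
        return [s / self.counts[y] for s in self.sums[y]]

    def predict(self, x):
        """Return the label whose prototype is closest in Euclidean distance."""
        return min(self.sums, key=lambda y: math.dist(x, self.prototype(y)))


# Two sessions with disjoint classes; session 1 data is discarded before session 2.
clf = PrototypeClassifier()
clf.learn_session([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1]], ["cat", "cat", "dog"])
clf.learn_session([[5.0, 5.0], [5.2, 4.8]], ["bird", "bird"])

old_class_prediction = clf.predict([0.1, 0.0])
new_class_prediction = clf.predict([5.1, 5.0])
```

Because learning a new class only adds a new entry and never modifies existing ones, the statistics of old classes are untouched by later sessions. This is the property the sketch is meant to illustrate; it says nothing about how the paper's feature extractor or analytical classifier is actually trained.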