Fixed Random Classifier Rearrangement for Continual Learning

With the explosive growth of data, continual learning capability is increasingly important for neural networks. Due to catastrophic forgetting, neural networks inevitably forget the knowledge of old tasks after learning new ones. In visual classification scenarios, a common practice for alleviating forgetting is to constrain the backbone. However, the impact of the classifiers is underestimated. In this paper, we analyze the variation of model predictions in sequential binary classification tasks and find that the norm of the equivalent one-class classifiers significantly affects the level of forgetting. Based on this conclusion, we propose a two-stage continual learning algorithm named Fixed Random Classifier Rearrangement (FRCR). In the first stage, FRCR replaces the learnable classifiers with fixed random classifiers, constraining the norm of the equivalent one-class classifiers without affecting the performance of the network. In the second stage, FRCR rearranges the entries of the new classifiers to implicitly reduce the drift of old latent representations. Experimental results on multiple datasets show that FRCR significantly mitigates model forgetting; subsequent experimental analyses further validate the effectiveness of the algorithm.
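
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the "equivalent one-class classifier" of a binary task is the difference between the two class weight vectors, which follows from a two-way softmax but is not spelled out in the abstract. The helper names (`fixed_random_classifier`, `equivalent_one_class`, `rearrange`), the Gaussian initialization, and the arbitrary permutation are all hypothetical; the abstract does not say how FRCR chooses its rearrangement.

```python
import math
import random


def fixed_random_classifier(num_classes, dim, std=1.0, seed=0):
    """Draw classifier weights once from N(0, std^2); they are never trained.

    Gaussian initialization is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_classes)]


def equivalent_one_class(w_pos, w_neg):
    """Assumed definition: for a two-way softmax, p(pos) = sigmoid((w_pos - w_neg) . z)."""
    return [a - b for a, b in zip(w_pos, w_neg)]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def rearrange(v, perm):
    """Reorder the entries of v; perm must be a permutation of range(len(v))."""
    return [v[i] for i in perm]


# Stage-one flavour: a frozen random head for one binary task.
weights = fixed_random_classifier(num_classes=2, dim=8)
w_eq = equivalent_one_class(weights[1], weights[0])

# A two-way softmax agrees with a sigmoid on the equivalent classifier.
z = [0.5] * 8  # stand-in latent representation
logit_neg = sum(w * x for w, x in zip(weights[0], z))
logit_pos = sum(w * x for w, x in zip(weights[1], z))
p_softmax = math.exp(logit_pos) / (math.exp(logit_pos) + math.exp(logit_neg))
p_sigmoid = 1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(w_eq, z))))

# Stage-two flavour: any rearrangement of entries leaves the norm unchanged.
# The permutation here is arbitrary, not the one FRCR would select.
perm = list(range(8))
random.Random(1).shuffle(perm)
w_rearranged = rearrange(w_eq, perm)
```

Because a permutation only reorders entries, a rearranged classifier keeps whatever norm the fixed random draw gave it, so in this sketch the second step cannot undo the norm constraint from the first.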

