Continual learning aims to learn new tasks continuously (plasticity) without forgetting knowledge acquired from old tasks (stability). In online continual learning, where data arrives strictly as a stream, plasticity is more vulnerable than in offline continual learning because the training signal obtainable from a single data point is limited. To address the stability-plasticity dilemma in online continual learning, we propose the multi-scale feature adaptation network (MuFAN), an online continual learning framework that utilizes a richer context encoding extracted from different levels of a pre-trained network. Additionally, we introduce a novel structure-wise distillation loss and replace the commonly used batch normalization layer with a newly proposed stability-plasticity normalization module, so that MuFAN simultaneously maintains high plasticity and stability. MuFAN outperforms other state-of-the-art continual learning methods on the SVHN, CIFAR100, miniImageNet, and CORe50 datasets. Extensive experiments and ablation studies validate the significance and scalability of each proposed component: 1) multi-scale feature maps from a pre-trained encoder, 2) the structure-wise distillation loss, and 3) the stability-plasticity normalization module. Code is publicly available at https://github.com/whitesnowdrop/MuFAN.
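As an illustration of what combining feature maps "from different levels of a pre-trained network" can look like, the following is a minimal sketch and not the MuFAN implementation. It assumes one simple fusion rule: average-pool each level to the coarsest spatial resolution, then concatenate along the channel axis. The function names `avg_pool_to` and `fuse_multi_scale`, the nested-list layout `[channels][height][width]`, and the toy inputs are all hypothetical; the actual fusion used in MuFAN may differ.

```python
# Hypothetical sketch: fuse feature maps taken from several depths of a frozen
# encoder. Each feature map is a nested list shaped [channels][height][width].
# The fusion rule (average pooling + channel concatenation) is an assumption.


def avg_pool_to(fmap, out_size):
    """Average-pool every channel of a square feature map to out_size x out_size."""
    pooled = []
    for channel in fmap:
        size = len(channel)
        assert size % out_size == 0, "input size must be a multiple of out_size"
        k = size // out_size  # pooling window and stride
        out = []
        for i in range(out_size):
            row = []
            for j in range(out_size):
                window = [
                    channel[i * k + di][j * k + dj]
                    for di in range(k)
                    for dj in range(k)
                ]
                row.append(sum(window) / (k * k))
            out.append(row)
        pooled.append(out)
    return pooled


def fuse_multi_scale(feature_maps):
    """Pool every level to the coarsest resolution, then concatenate channels."""
    target = min(len(fmap[0]) for fmap in feature_maps)
    fused = []
    for fmap in feature_maps:
        fused.extend(avg_pool_to(fmap, target))
    return fused


# Toy stand-ins for three encoder levels (constant values, for readability).
shallow = [[[1.0] * 8 for _ in range(8)] for _ in range(2)]  # 2 channels, 8x8
middle = [[[2.0] * 4 for _ in range(4)] for _ in range(3)]   # 3 channels, 4x4
deep = [[[3.0] * 2 for _ in range(2)] for _ in range(4)]     # 4 channels, 2x2

fused = fuse_multi_scale([shallow, middle, deep])
print(len(fused), len(fused[0]), len(fused[0][0]))  # 9 2 2
```

Under this assumed rule, the fused representation keeps one channel group per encoder level, so a downstream classifier sees both shallow, high-resolution cues and deep, semantic ones. In a real pipeline the same idea would operate on tensors from the pre-trained encoder rather than nested lists.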