Preserving Linear Separability in Continual Learning by Backward Feature Projection

Catastrophic forgetting has been a major challenge in continual learning, where the model needs to learn new tasks with limited or no access to data from previously seen tasks. To tackle this challenge, methods based on knowledge distillation in feature space have been proposed and shown to reduce forgetting. However, most feature distillation methods directly constrain the new features to match the old ones, overlooking the need for plasticity. To achieve a better stability-plasticity trade-off, we propose Backward Feature Projection (BFP), a method for continual learning that allows the new features to change up to a learnable linear transformation of the old features. BFP preserves the linear separability of the old classes while allowing the emergence of new feature directions to accommodate new classes. BFP can be integrated with existing experience replay methods and boost performance by a significant margin. We also demonstrate that BFP helps learn a better representation space, in which linear separability is well preserved during continual learning and linear probing achieves high classification accuracy. The code can be found at https://github.com/rvl-lab-utoronto/BFP
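The difference between direct feature distillation and a distillation loss taken up to a linear transformation can be illustrated on synthetic features. The sketch below is not the authors' implementation. It assumes that new features are mapped back onto the old ones by a matrix `A`, and it fits `A` by closed-form least squares as a stand-in for a transformation learned jointly with the network. The function names `direct_distillation_loss` and `fit_backward_projection`, the feature dimension, and the synthetic data are all hypothetical.

```python
import numpy as np


def direct_distillation_loss(z_new, z_old):
    """Mean squared distance between new and old features (no projection)."""
    return float(np.mean(np.sum((z_new - z_old) ** 2, axis=1)))


def fit_backward_projection(z_new, z_old):
    """Fit A minimizing ||z_new @ A - z_old||^2 and return (loss, A).

    Assumption: least squares replaces the gradient-based learning of A.
    """
    A, *_ = np.linalg.lstsq(z_new, z_old, rcond=None)
    residual = z_new @ A - z_old
    return float(np.mean(np.sum(residual ** 2, axis=1))), A


rng = np.random.default_rng(0)

# Synthetic "old" features: 200 samples, 8 dimensions.
z_old = rng.normal(size=(200, 8))

# Simulate feature drift as a well-conditioned linear map (scaled rotation).
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
M = 2.0 * Q
z_new = z_old @ M

direct = direct_distillation_loss(z_new, z_old)
projected, A = fit_backward_projection(z_new, z_old)

# A linear classifier w on the old features carries over to the new
# features as A @ w, so the decision scores are unchanged.
w = rng.normal(size=8)
scores_old = z_old @ w
scores_new = z_new @ (A @ w)

print("direct distillation loss:   ", direct)
print("projected distillation loss:", projected)
print("max score difference:       ", float(np.max(np.abs(scores_old - scores_new))))
```

In this toy setting the direct loss penalizes the drift heavily even though no linear decision boundary has been lost, while the projected loss is close to zero. This is the sense in which a constraint up to a linear transformation leaves room for plasticity while preserving the linear separability of old classes.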

