Kaizen: Practical Self-supervised Continual Learning with Continual Fine-tuning

From arXiv. Presented at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024. The code for this work is available at https://github.com/dr-bell/kaizen

Self-supervised learning (SSL) has shown remarkable performance in computer vision tasks when trained offline. However, in a Continual Learning (CL) scenario where new data is introduced progressively, models still suffer from catastrophic forgetting. Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient. Previous approaches suggested re-purposing self-supervised objectives with knowledge distillation to mitigate forgetting across tasks, assuming that labels from all tasks are available during fine-tuning. In this paper, we generalize self-supervised continual learning to a practical setting where available labels can be leveraged at any step of the SSL process. As the number of continual tasks grows, this offers more flexibility in the pre-training and fine-tuning phases. With Kaizen, we introduce a training architecture that mitigates catastrophic forgetting for both the feature extractor and the classifier with a carefully designed loss function. Using a comprehensive set of evaluation metrics that reflect different aspects of continual learning, we demonstrate that Kaizen significantly outperforms previous SSL models on competitive vision benchmarks, with up to 16.5% accuracy improvement on split CIFAR-100. Kaizen balances the trade-off between knowledge retention and learning from new data with an end-to-end model, paving the way for practical deployment of continual learning systems.
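The abstract does not spell out the loss function, so the following is only a rough sketch of what a loss covering both the feature extractor and the classifier could look like. It assumes four terms: a self-supervised term on the current task, a distillation term that keeps current features close to those of a frozen previous extractor, a supervised cross-entropy term for the classifier, and a distillation term against the previous classifier's softened predictions. The function names, the negative-cosine stand-in for the SSL objective, the temperature, and the equal weights are all illustrative assumptions and are not taken from the paper or its repository.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, logits, temperature=1.0):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)

def negative_cosine(u, v):
    """Negative cosine similarity; lower means the vectors agree more."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return -dot / (norm_u * norm_v)

def combined_loss(feat_view1, feat_view2, feat_old, logits_new, logits_old,
                  label, temperature=2.0, weights=(1.0, 1.0, 1.0, 1.0)):
    """Hypothetical four-term loss (an assumption, not the paper's definition).

    feat_view1, feat_view2: current extractor features of two augmented views
    feat_old:               frozen previous extractor's features of view 1
    logits_new, logits_old: current and frozen previous classifier outputs
    label:                  integer class index for the labelled sample
    """
    one_hot = [1.0 if i == label else 0.0 for i in range(len(logits_new))]
    terms = {
        # Learn from new data without labels (stand-in SSL objective).
        "ssl_current": negative_cosine(feat_view1, feat_view2),
        # Retain old representations via feature distillation.
        "kd_features": negative_cosine(feat_view1, feat_old),
        # Learn the current task's labels.
        "ce_current": cross_entropy(one_hot, logits_new),
        # Retain old predictions via classifier distillation.
        "kd_classifier": cross_entropy(
            softmax(logits_old, temperature), logits_new, temperature),
    }
    terms["total"] = sum(w * terms[k] for w, k in zip(
        weights, ("ssl_current", "kd_features", "ce_current", "kd_classifier")))
    return terms

if __name__ == "__main__":
    losses = combined_loss([1.0, 0.0], [0.9, 0.1], [1.0, 0.0],
                           [2.0, 0.5], [1.5, 0.5], label=0)
    for name, value in losses.items():
        print(f"{name}: {value:.4f}")
```

In this sketch the two distillation terms pull toward knowledge retention and the two current-task terms pull toward new data, so the `weights` tuple is one place where the trade-off mentioned in the abstract could be tuned.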

