Self-supervised learning (SSL) has shown remarkable performance in computer vision tasks when models are trained offline. However, in a Continual Learning (CL) scenario where new data is introduced progressively, models still suffer from catastrophic forgetting. Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient. Previous approaches repurposed self-supervised objectives with knowledge distillation to mitigate forgetting across tasks, assuming that labels from all tasks are available during fine-tuning. In this paper, we generalize self-supervised continual learning to a practical setting where available labels can be leveraged at any step of the SSL process. As the number of continual tasks grows, this offers more flexibility in the pre-training and fine-tuning phases. With Kaizen, we introduce a training architecture that mitigates catastrophic forgetting for both the feature extractor and the classifier through a carefully designed loss function. Using a comprehensive set of evaluation metrics that reflect different aspects of continual learning, we demonstrate that Kaizen significantly outperforms previous SSL models on competitive vision benchmarks, with an accuracy improvement of up to 16.5% on split CIFAR-100. Kaizen balances the trade-off between knowledge retention and learning from new data with an end-to-end model, paving the way for practical deployment of continual learning systems.
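The abstract does not spell out the loss function, so the following is only a rough illustration of how a loss might cover both the feature extractor and the classifier, each with a current-task term and a distillation term against a frozen copy of the previous model. Every name here (`neg_cosine`, `hard_ce`, `soft_ce`, `combined_loss`), the choice of negative cosine similarity as the self-supervised and feature-distillation objective, and the equal default weights are assumptions made for the sketch, not details taken from the paper.

```python
import numpy as np


def neg_cosine(a, b):
    """Mean negative cosine similarity between matching rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(-np.mean(np.sum(a * b, axis=1)))


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def hard_ce(logits, labels):
    """Cross-entropy against integer class labels."""
    log_p = log_softmax(logits)
    return float(-np.mean(log_p[np.arange(len(labels)), labels]))


def soft_ce(student_logits, teacher_logits):
    """Cross-entropy against the teacher's softmax outputs (distillation)."""
    teacher_p = np.exp(log_softmax(teacher_logits))
    return float(-np.mean(np.sum(teacher_p * log_softmax(student_logits), axis=1)))


def combined_loss(z1, z2, z_old, logits, logits_old, labels,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of four illustrative terms (weights are an assumption).

    z1, z2      : current-model embeddings of two augmented views
    z_old       : frozen previous-model embeddings of the first view
    logits      : current classifier outputs
    logits_old  : frozen previous classifier outputs
    labels      : integer labels for the samples that have them
    """
    terms = (
        neg_cosine(z1, z2),           # feature extractor, current task
        neg_cosine(z1, z_old),        # feature extractor, distillation
        hard_ce(logits, labels),      # classifier, current task
        soft_ce(logits, logits_old),  # classifier, distillation
    )
    return sum(w * t for w, t in zip(weights, terms))
```

In this sketch the two distillation terms pull the current model toward the frozen previous one, while the two current-task terms fit the new data; the weights are where the trade-off between retention and new learning would be set.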