Continual learning requires learning incremental tasks under dynamic data distributions. It has been observed that training with a combination of contrastive loss and distillation loss yields strong performance in continual learning. However, to the best of our knowledge, this contrastive continual learning framework lacks a convincing theoretical explanation. In this work, we fill this gap by establishing theoretical performance guarantees, which reveal how the performance of the model is bounded by the training losses of previous tasks in the contrastive continual learning framework. Our theoretical analysis further supports the idea that pre-training can benefit continual learning. Inspired by these guarantees, we propose a novel contrastive continual learning algorithm called CILA, which uses adaptive distillation coefficients for different tasks. Each coefficient is easily computed as the ratio of the average distillation loss to the average contrastive loss on previous tasks. Our method shows substantial improvement on standard benchmarks and achieves new state-of-the-art performance.
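As a minimal sketch of the coefficient computation described above (not the authors' implementation; the function names, loss logging scheme, and the way the two losses are combined are assumptions for illustration), the adaptive coefficient is simply a ratio of running averages:

```python
import torch


def adaptive_distillation_coefficient(distill_losses: torch.Tensor,
                                      contrast_losses: torch.Tensor) -> torch.Tensor:
    """Adaptive coefficient: ratio of the average distillation loss to the
    average contrastive loss, both recorded on previous tasks."""
    return distill_losses.mean() / contrast_losses.mean()


def combined_loss(contrastive_loss: torch.Tensor,
                  distillation_loss: torch.Tensor,
                  coeff: torch.Tensor) -> torch.Tensor:
    # Hypothetical combined objective for the current task: the contrastive
    # term plus the distillation term scaled by the task-adaptive coefficient.
    return contrastive_loss + coeff * distillation_loss


# Hypothetical usage: per-batch losses logged while training on the previous task.
prev_distill = torch.tensor([0.80, 0.70, 0.75])
prev_contrast = torch.tensor([2.00, 1.90, 2.10])
coeff = adaptive_distillation_coefficient(prev_distill, prev_contrast)  # ~0.375
```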