EsaCL: Efficient Continual Learning of Sparse Models

A key challenge in the continual learning setting is to efficiently learn a sequence of tasks without forgetting how to perform previously learned tasks. Many existing approaches to this problem work either by retraining the model on previous tasks or by expanding the model to accommodate new tasks. However, these approaches typically suffer from increased storage and computational requirements, a problem that is worsened in the case of sparse models due to the need for expensive retraining after sparsification. To address this challenge, we propose a new method for efficient continual learning of sparse models (EsaCL) that can automatically prune redundant parameters without adversely impacting the model's predictive power, and that circumvents the need for retraining. We conduct a theoretical analysis of loss landscapes with parameter pruning, and design a directional pruning (SDP) strategy that is informed by the sharpness of the loss function with respect to the model parameters. SDP yields a sparse model with minimal loss of predictive accuracy, accelerating the learning of sparse models at each stage. To accelerate model updates, we introduce an intelligent data selection (IDS) strategy that identifies the instances critical for estimating the loss landscape, yielding substantially improved data efficiency. The results of our experiments show that EsaCL achieves performance that is competitive with state-of-the-art methods on three continual learning benchmarks, while using substantially less memory and fewer computational resources.
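The intuition behind using sharpness to guide pruning can be illustrated with a toy example. The sketch below is not the SDP algorithm from the paper; it assumes a diagonal quadratic loss, plain magnitude pruning, and a simple sharpness proxy (the loss increase after a small step of length `rho` along the normalized gradient). All function names and values are hypothetical and chosen only for illustration.

```python
import math

def loss(w, curv, center):
    """Toy quadratic loss: 0.5 * sum_i curv[i] * (w[i] - center[i])**2."""
    return 0.5 * sum(h * (wi - c) ** 2 for wi, h, c in zip(w, curv, center))

def grad(w, curv, center):
    """Gradient of the toy quadratic loss with respect to w."""
    return [h * (wi - c) for wi, h, c in zip(w, curv, center)]

def sharpness(w, curv, center, rho=0.05):
    """Sharpness proxy (an assumption of this sketch): loss increase after
    a step of length rho along the normalized gradient direction."""
    g = grad(w, curv, center)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return 0.0
    w_perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    return loss(w_perturbed, curv, center) - loss(w, curv, center)

def magnitude_prune(w, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(round(sparsity * len(w)))
    drop = set(sorted(range(len(w)), key=lambda i: abs(w[i]))[:k])
    return [0.0 if i in drop else wi for i, wi in enumerate(w)]

# The same minimizer under two hypothetical landscapes: one flat, one sharp.
center = [2.0, -1.5, 0.02, -0.01]
flat = [0.1, 0.1, 0.1, 0.1]
sharp = [10.0, 10.0, 10.0, 10.0]

# Prune half of the weights starting from the minimizer.
w_sparse = magnitude_prune(center, sparsity=0.5)

loss_flat = loss(w_sparse, flat, center)
loss_sharp = loss(w_sparse, sharp, center)
```

In this toy setting the pruned weights are identical in both landscapes, but the loss incurred by pruning is smaller in the flat one, which is the kind of effect a sharpness-informed pruning strategy aims to exploit.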

