Elastic Multi-Gradient Descent for Parallel Continual Learning

The goal of Continual Learning (CL) is to learn continuously from new data streams and accomplish the corresponding tasks. Previously studied CL assumes that the data for different tasks arrive strictly one task after another, so it is in fact Serial Continual Learning (SCL). This paper studies the novel paradigm of Parallel Continual Learning (PCL) in dynamic multi-task scenarios, where a diverse set of tasks is encountered at different time points. PCL is challenging because an unspecified number of tasks, each at a different stage of learning progress, must be trained together, which makes it difficult to guarantee an effective model update for every encountered task. In our previous conference work, we focused on measuring and reducing the discrepancy among gradients in a multi-objective optimization problem; that approach, however, may still incur negative transfer in every model update. To address this issue, we introduce task-specific elastic factors into the dynamic multi-objective optimization problem to adjust the descent direction towards the Pareto front. The proposed method, called Elastic Multi-Gradient Descent (EMGD), ensures that each update follows an appropriate Pareto descent direction, minimizing any negative impact on previously learned tasks. To balance training between old and new tasks, we also propose a memory editing mechanism guided by the gradient computed using EMGD. This editing process updates the stored data points, reducing the interference that previous tasks cause in the Pareto descent direction. Experiments on public datasets validate the effectiveness of EMGD in the PCL setting.
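To make the notion of a Pareto descent direction concrete, the sketch below shows the classical two-gradient min-norm rule from standard multiple-gradient descent. It is not EMGD itself: the abstract does not specify how the elastic factors are computed, so none are modelled here. The function names `dot` and `min_norm_direction` are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def min_norm_direction(g1, g2):
    """Return the min-norm convex combination of two task gradients.

    Minimizes ||alpha * g1 + (1 - alpha) * g2||^2 over alpha in [0, 1].
    The result d satisfies dot(d, g1) >= 0 and dot(d, g2) >= 0, so a small
    step along -d does not increase either task loss to first order.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:
        # Identical gradients: every convex combination is the same vector.
        alpha = 0.5
    else:
        # Unconstrained minimizer of the quadratic, clipped to [0, 1].
        alpha = -dot(diff, g2) / denom
        alpha = min(1.0, max(0.0, alpha))
    d = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return d, alpha


# Two partially conflicting task gradients (assumed toy values).
g_old = [1.0, 0.0]
g_new = [-0.5, 1.0]
d, alpha = min_norm_direction(g_old, g_new)
print(alpha, dot(d, g_old), dot(d, g_new))
```

With these toy gradients the combined direction has a non-negative inner product with both task gradients, which is the property the abstract refers to when it says an update should not harm previously learned tasks.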

