Deep representation learning methods struggle with continual learning, suffering both from catastrophic forgetting of useful units and from loss of plasticity, often caused by rigid and unuseful units. While many methods address these two issues separately, only a few currently handle both simultaneously. In this paper, we introduce Utility-based Perturbed Gradient Descent (UPGD) as a novel approach for the continual learning of representations. UPGD combines gradient updates with perturbations, applying smaller modifications to more useful units, protecting them from forgetting, and larger modifications to less useful units, rejuvenating their plasticity. We use a challenging streaming learning setup in which continual learning problems have hundreds of non-stationarities and unknown task boundaries. We show that many existing methods suffer from at least one of the issues, predominantly manifested by their decreasing accuracy over tasks. UPGD, in contrast, continues to improve performance and surpasses or is competitive with all methods on all problems. Finally, in extended reinforcement learning experiments with PPO, we show that while Adam exhibits a performance drop after initial learning, UPGD avoids it by addressing both continual learning issues.