In continual learning, plasticity refers to the ability of an agent to quickly adapt to new information. Neural networks are known to lose plasticity when processing non-stationary data streams. In this paper, we propose L2 Init, a very simple approach for maintaining plasticity by incorporating in the loss function L2 regularization toward initial parameters. This is very similar to standard L2 regularization (L2), the only difference being that L2 regularizes toward the origin. L2 Init is simple to implement and requires selecting only a single hyper-parameter. The motivation for this method is the same as that of methods that reset neurons or parameter values. Intuitively, when recent losses are insensitive to particular parameters, these parameters drift toward their initial values. This prepares parameters to adapt quickly to new tasks. On simple problems representative of different types of nonstationarity in continual learning, we demonstrate that L2 Init consistently mitigates plasticity loss. We additionally find that our regularization term reduces parameter magnitudes and maintains a high effective feature rank.
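To make the contrast between standard L2 and L2 Init concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. The function names, the flat list-of-floats parameter representation, the coefficient `lam`, and the learning rate `lr` are all assumptions chosen for illustration. The loop imitates the situation described above, in which the task loss is insensitive to the parameters (zero task gradient), so only the regularizer acts on them.

```python
# Illustrative sketch only: names and hyper-parameter values are assumptions,
# not taken from the paper.

def l2_penalty(params, lam):
    """Standard L2: squared distance of the parameters from the origin."""
    return lam * sum(p * p for p in params)


def l2_init_penalty(params, init_params, lam):
    """L2 Init: squared distance of the parameters from their initial values."""
    return lam * sum((p - p0) ** 2 for p, p0 in zip(params, init_params))


def sgd_step(params, init_params, task_grads, lam, lr):
    """One SGD step on (task loss + L2 Init penalty).

    The gradient of the penalty with respect to p is 2 * lam * (p - p0).
    """
    return [
        p - lr * (g + 2.0 * lam * (p - p0))
        for p, p0, g in zip(params, init_params, task_grads)
    ]


init_params = [0.5, -0.3]
params = [2.0, -0.3]     # the first parameter has moved away from its initial value
zero_grads = [0.0, 0.0]  # assumed: the recent task loss is insensitive to both parameters

for _ in range(100):
    params = sgd_step(params, init_params, zero_grads, lam=0.1, lr=0.5)
```

With a zero task gradient, each step shrinks the gap to the initial value by a constant factor of `1 - 2 * lam * lr`, so the displaced parameter decays geometrically back toward `0.5` rather than toward `0.0` as it would under standard L2.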