In continual learning, plasticity refers to the ability of an agent to quickly adapt to new information. Neural networks are known to lose plasticity when processing non-stationary data streams. In this paper, we propose L2 Init, a simple approach for maintaining plasticity that adds to the loss function an L2 regularization term toward the initial parameters. This is very similar to standard L2 regularization (L2), the only difference being that standard L2 regularizes toward the origin. L2 Init is simple to implement and requires selecting only a single hyper-parameter. The motivation for this method is the same as that of methods that reset neurons or parameter values. Intuitively, when recent losses are insensitive to particular parameters, these parameters should drift toward their initial values. This prepares parameters to adapt quickly to new tasks. On problems representative of different types of non-stationarity in continual supervised learning, we demonstrate that L2 Init most consistently mitigates plasticity loss compared to previously proposed approaches.
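To make the contrast with standard L2 concrete, the following is a minimal sketch of the regularizer described above. It assumes the penalty is the squared L2 distance to the initial parameters scaled by a single coefficient; the function names, the coefficient name `lam`, the plain gradient-descent update, and all numeric values are illustrative assumptions, not taken from the paper.

```python
# Sketch of L2 regularization toward initial parameters (assumed form:
# lam * ||theta - theta_0||^2). Names and values here are hypothetical.

def l2_init_penalty(params, init_params, lam):
    """Penalty that pulls parameters toward their initial values."""
    return lam * sum((p - p0) ** 2 for p, p0 in zip(params, init_params))


def l2_penalty(params, lam):
    """Standard L2 penalty, which pulls parameters toward the origin."""
    return lam * sum(p ** 2 for p in params)


def l2_init_grad(params, init_params, lam):
    """Gradient of l2_init_penalty with respect to params."""
    return [2.0 * lam * (p - p0) for p, p0 in zip(params, init_params)]


def sgd_step(params, task_grad, init_params, lam, lr):
    """One gradient-descent step on task loss plus the L2 Init penalty."""
    reg_grad = l2_init_grad(params, init_params, lam)
    return [p - lr * (g + r) for p, g, r in zip(params, task_grad, reg_grad)]


# When the task loss is insensitive to the parameters (zero task gradient),
# only the regularizer acts, so the parameters drift back toward their
# initial values.
init_params = [0.5, -0.3]
params = [2.0, 1.0]
for _ in range(200):
    params = sgd_step(params, [0.0, 0.0], init_params, lam=0.1, lr=0.5)
```

With a zero task gradient, each step shrinks the gap to the initial values by a constant factor, which mirrors the intuition that parameters the recent losses do not depend on should return to their initialization. Setting `init_params` to all zeros recovers the standard L2 penalty.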