Modern representation learning methods often struggle to adapt quickly under non-stationarity because they suffer from catastrophic forgetting and decaying plasticity. These problems prevent fast adaptation because a learner may forget useful features or have difficulty learning new ones, which renders such methods ineffective for continual learning. This paper proposes Utility-based Perturbed Gradient Descent (UPGD), an online learning algorithm well-suited for continual learning agents. UPGD protects useful weights or features from forgetting and perturbs less useful ones based on their utilities. Our empirical results show that UPGD helps reduce forgetting and maintain plasticity, enabling modern representation learning methods to work effectively in continual learning.
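To make the idea of protecting useful weights and perturbing less useful ones concrete, the following is a minimal sketch of one utility-gated, perturbed gradient step. It is not the authors' implementation. The abstract does not specify how utilities are computed, so the first-order utility estimate (`-grad * weight`), the running average, the sigmoid gate, and the names `upgd_step`, `lr`, `noise_std`, and `decay` are all assumptions chosen for illustration.

```python
import math
import random


def upgd_step(weights, grads, utilities, lr=0.01, noise_std=0.001,
              decay=0.999, rng=random):
    """One illustrative utility-gated, perturbed gradient step.

    All quantities are flat lists of floats. Returns (new_weights, new_utilities).
    """
    # ASSUMPTION: utility is estimated to first order as -grad * weight
    # (roughly, how much the loss would rise if the weight were set to zero)
    # and tracked with an exponential running average.
    new_utils = [decay * u + (1.0 - decay) * (-g * w)
                 for u, g, w in zip(utilities, grads, weights)]

    # ASSUMPTION: utilities are scaled by the largest magnitude and squashed
    # with a sigmoid, giving a gate in (0, 1) per weight.
    max_u = max(abs(u) for u in new_utils) or 1.0
    gates = [1.0 / (1.0 + math.exp(-u / max_u)) for u in new_utils]

    # High-utility weights (gate near 1) receive a smaller update and less
    # noise; low-utility weights (gate near 0) are updated and perturbed more.
    new_weights = [w - lr * (g + rng.gauss(0.0, noise_std)) * (1.0 - gate)
                   for w, g, gate in zip(weights, grads, gates)]
    return new_weights, new_utils
```

In this sketch, calling `upgd_step([1.0, 1.0], [-1.0, 1.0], [0.0, 0.0], lr=0.1, noise_std=0.0, decay=0.0)` moves the first weight less than the second, because the first weight has the higher estimated utility and is therefore more strongly protected.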