Neural networks perform poorly when data or tasks are presented sequentially. Unlike humans, they suffer severely from catastrophic forgetting, which makes lifelong learning impossible. To address this issue, memory-based continual learning has been actively studied and stands out as one of the best-performing approaches. We examine memory-based continual learning and find that large variation in the representation space is crucial for avoiding catastrophic forgetting. Motivated by this, we propose to diversify representations using two types of perturbation: model-agnostic variation (i.e., variation generated without knowledge of the learned neural network) and model-based variation (i.e., variation conditioned on the learned neural network). We show that enlarging representational variation serves as a general principle for improving continual learning. Finally, our empirical studies demonstrate that our method, as a simple plug-and-play component, consistently improves a number of memory-based continual learning methods by a large margin.
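To make the distinction between the two kinds of variation concrete, the following is a minimal sketch rather than the method itself. The abstract does not specify how either perturbation is generated, so everything here is an assumption made for illustration: model-agnostic variation is stood in for by isotropic Gaussian noise, model-based variation by a single signed-gradient step against a linear softmax head, and the names `model_agnostic_variation`, `model_based_variation`, `sigma`, and `eps` are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(z, y, W, b):
    """Mean cross-entropy of a linear softmax head on representations z."""
    probs = softmax(z @ W + b)
    return -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()


def model_agnostic_variation(z, sigma, rng):
    """Assumed form: Gaussian noise, generated without looking at the model."""
    return z + sigma * rng.standard_normal(z.shape)


def model_based_variation(z, y, W, b, eps):
    """Assumed form: one signed-gradient step that depends on the model (W, b)."""
    probs = softmax(z @ W + b)
    onehot = np.eye(W.shape[1])[y]
    # Gradient of the cross-entropy w.r.t. z for a linear softmax head.
    grad_z = (probs - onehot) @ W.T
    return z + eps * np.sign(grad_z)


rng = np.random.default_rng(0)
z = rng.standard_normal((8, 4))   # hypothetical representations of memory samples
y = rng.integers(0, 3, size=8)    # their labels (3 classes)
W = rng.standard_normal((4, 3))   # hypothetical linear head standing in for the learned network
b = np.zeros(3)

z_agnostic = model_agnostic_variation(z, sigma=0.1, rng=rng)
z_based = model_based_variation(z, y, W, b, eps=0.1)

# Plug-and-play use: train on the original and the varied representations together.
z_augmented = np.concatenate([z, z_agnostic, z_based])
y_augmented = np.concatenate([y, y, y])
```

In this sketch the only difference between the two functions is whether the model parameters `W` and `b` are consulted. A memory-based method would then replay `z_augmented` with `y_augmented` in place of the original memory samples.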