Continual learning (CL) in deep neural networks (DNNs) involves incrementally accumulating knowledge from a growing data stream. A major challenge in CL is that non-stationary data streams cause catastrophic forgetting of previously learned abilities. Rehearsal, which stores past observations in a buffer and mixes them with new observations during learning, is a popular and effective way to mitigate this problem. This raises a question: which stored samples should be selected for rehearsal? Choosing the samples that are most useful for learning, rather than selecting them at random, could lead to significantly faster learning. For class-incremental learning, prior work has shown that a simple class-balanced random selection policy outperforms more sophisticated methods. Here, we revisit this question by exploring a new sample selection policy called GRASP. GRASP selects the most prototypical (class-representative) samples first and then gradually selects less prototypical (harder) examples to update the DNN. GRASP adds little compute or memory overhead compared to uniform selection, enabling it to scale to large datasets. We evaluate GRASP and other policies by conducting CL experiments on the large-scale ImageNet-1K and Places-LT image classification datasets. In these experiments, GRASP outperforms all other rehearsal policies evaluated. Beyond vision, we also demonstrate that GRASP is effective for CL on five text classification datasets.
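The easy-to-hard ordering described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how prototypicality is measured, so the code assumes Euclidean distance to the class-mean embedding and assumes a round-robin interleaving across classes to keep the order class balanced. The function name `prototype_ordered_indices` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest
import math


def prototype_ordered_indices(embeddings, labels):
    """Return buffer indices ordered from most to least prototypical.

    Assumptions (not stated in the abstract): prototypicality is the
    Euclidean distance to the class-mean embedding, and classes are
    interleaved round-robin so the ordering stays class balanced.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    ranked_per_class = []
    for label in sorted(by_class):
        members = by_class[label]
        dim = len(embeddings[members[0]])
        # Class prototype: mean embedding of the class's stored samples.
        prototype = [
            sum(embeddings[i][d] for i in members) / len(members)
            for d in range(dim)
        ]
        # Closest to the prototype first (easy), farthest last (hard).
        ranked = sorted(
            members, key=lambda i: math.dist(embeddings[i], prototype)
        )
        ranked_per_class.append(ranked)

    # Round-robin across classes: the k-th closest sample of every class
    # precedes any (k+1)-th closest sample.
    order = []
    for tier in zip_longest(*ranked_per_class):
        order.extend(i for i in tier if i is not None)
    return order


# Toy rehearsal buffer: two classes in 2-D, each with one outlier.
embeddings = [
    [0.0, 0.0], [0.1, 0.0], [2.0, 0.0],         # class 0
    [10.0, 10.0], [10.0, 10.2], [10.0, 14.0],   # class 1
]
labels = [0, 0, 0, 1, 1, 1]
order = prototype_ordered_indices(embeddings, labels)
print(order)  # [1, 4, 0, 3, 2, 5]
```

In this toy buffer the two outliers (indices 2 and 5) are scheduled last, so a rehearsal loop that consumes `order` from the front would see the most class-representative samples first and the harder ones later.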