By default, neural networks learn on all training data at once. When such a model is trained on sequential chunks of new data, it tends to catastrophically forget how to handle old data. In this work we investigate how continual learners learn and forget representations. We observe two phenomena: knowledge accumulation, i.e. the improvement of a representation over time, and feature forgetting, i.e. the loss of task-specific representations. To better understand both phenomena, we introduce a new analysis technique called task exclusion comparison. If a model has seen a task and it has not forgotten all the task-specific features, then its representation for that task should be better than that of a model that was trained on similar tasks, but not that exact one. Our image classification experiments show that most task-specific features are quickly forgotten, in contrast to what has been suggested in the past. Further, we demonstrate how some continual learning methods, like replay, and ideas from representation learning affect a continually learned representation. We conclude by observing that representation quality is tightly correlated with continual learning performance.
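As a minimal sketch of the task exclusion comparison described above: the representation of a continually trained model is compared, on one task, with that of a model trained on the same task sequence minus that task. The abstract does not specify how representation quality is measured, so the closed-form ridge probe on frozen features is an assumption, as are the names `probe_accuracy` and `task_exclusion_gap` and the synthetic features used in place of real network activations.

```python
import numpy as np


def probe_accuracy(train_x, train_y, test_x, test_y, n_classes, l2=1e-3):
    """Fit a closed-form ridge probe on frozen features; return test accuracy.

    Assumption: a linear probe stands in for "representation quality".
    """
    targets = np.eye(n_classes)[train_y]           # one-hot targets
    dim = train_x.shape[1]
    gram = train_x.T @ train_x + l2 * np.eye(dim)  # regularized Gram matrix
    weights = np.linalg.solve(gram, train_x.T @ targets)
    predictions = (test_x @ weights).argmax(axis=1)
    return float((predictions == test_y).mean())


def task_exclusion_gap(feats_continual, feats_excluded, labels,
                       train_idx, test_idx, n_classes):
    """Probe accuracy of the continual model minus that of the exclusion model.

    feats_continual: features of the task's data from a model that saw the task.
    feats_excluded:  features of the same data from a model trained on the
                     other tasks only (hypothetical setup for illustration).
    """
    acc_continual = probe_accuracy(
        feats_continual[train_idx], labels[train_idx],
        feats_continual[test_idx], labels[test_idx], n_classes)
    acc_excluded = probe_accuracy(
        feats_excluded[train_idx], labels[train_idx],
        feats_excluded[test_idx], labels[test_idx], n_classes)
    return acc_continual - acc_excluded, acc_continual, acc_excluded


# Synthetic stand-ins for extracted features (not real experimental data).
rng = np.random.default_rng(0)
n = 300
labels = np.arange(n) % 2
# "Continual" features still encode the label; "excluded" features are noise.
feats_continual = np.eye(2)[labels] + 0.01 * rng.normal(size=(n, 2))
feats_excluded = rng.normal(size=(n, 2))
train_idx = np.arange(0, 200)
test_idx = np.arange(200, 300)

gap, acc_continual, acc_excluded = task_exclusion_gap(
    feats_continual, feats_excluded, labels, train_idx, test_idx, n_classes=2)
print(f"continual={acc_continual:.2f} excluded={acc_excluded:.2f} gap={gap:.2f}")
```

Under this reading, a gap near zero would suggest that the task-specific features have been forgotten, while a clearly positive gap would suggest that some of them are retained.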