Unlike primates, artificial neural networks (ANNs) trained on changing data distributions rapidly lose performance on old tasks, a phenomenon commonly referred to as catastrophic forgetting. In this paper, we investigate the representational changes that underlie this performance decrease and identify three distinct processes that together account for it. The largest component is a misalignment between hidden representations and readout layers. This misalignment arises when learning on additional tasks causes internal representations to shift. Representational geometry is partially conserved under this misalignment, and only a small part of the information is irrecoverably lost. All types of representational changes scale with the dimensionality of hidden representations. These insights have implications for deep learning applications that need to be continuously updated, and may also help align ANN models with the more robust biological vision.