Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

While it is established that neural networks suffer from catastrophic forgetting "at the output level", it is debated whether this is also the case at the level of representations. Some studies ascribe a certain level of innate robustness to representations, claiming that they forget only minimally and lose no critical information, while others claim that representations are also severely affected by forgetting. To settle this debate, we first discuss how this apparent disagreement might stem from the coexistence of two phenomena that affect the quality of continually learned representations: knowledge accumulation and feature forgetting. We then show that, even though feature forgetting can be small in absolute terms, newly learned information is forgotten just as catastrophically at the level of representations as it is at the output level. Next, we show that this feature forgetting is problematic because it substantially slows down knowledge accumulation. We further show that representations that are continually learned through either supervised or self-supervised learning suffer from feature forgetting. Finally, we study how feature forgetting and knowledge accumulation are affected by different types of continual learning methods.
