Transforming Transformers for Resilient Lifelong Learning

Lifelong learning without catastrophic forgetting (i.e., resiliency) remains an open problem for deep neural networks. Prior work mostly focuses on convolutional neural networks. With the increasing dominance of Transformers in deep learning, there is a pressing need to study lifelong learning with Transformers. Because training Transformers is complex in practice, a question naturally arises for lifelong learning: can Transformers learn to grow in a task-aware way, that is, be dynamically transformed by introducing lightweight, learnable plastic components into the architecture while retaining the parameter-heavy but stable components across streaming tasks? To that end, motivated by the lifelong learning capability maintained by the hippocampi in the human brain, we explore what Artificial Hippocampi (ArtiHippo) in Transformers would be and how to implement them. We present a method to identify, and learn to grow, ArtiHippo in Vision Transformers (ViTs) for resilient lifelong learning, addressing four questions: (i) Where should ArtiHippo be placed to enable plasticity while preserving the core function of ViTs across streaming tasks? (ii) How should ArtiHippo be represented and realized to ensure expressivity and adaptivity for tackling tasks of different natures in lifelong learning? (iii) How can ArtiHippo learn to grow so as to exploit task synergies (i.e., the learned knowledge) and overcome catastrophic forgetting? (iv) How can the best of the proposed ArtiHippo and prompting-based approaches be harnessed together? In experiments, we test the proposed method on the challenging Visual Domain Decathlon (VDD) benchmark and the 5-Dataset benchmark under the task-incremental lifelong learning setting. It obtains consistently better performance than the prior art, with sensible ArtiHippo learned continually. To our knowledge, it is the first attempt at lifelong learning with ViTs on the challenging VDD benchmark.
