Computer vision models suffer from a phenomenon known as catastrophic forgetting when learning novel concepts from continuously shifting training data. Typical solutions to this continual learning problem require extensive rehearsal of previously seen data, which increases memory costs and may violate data privacy. Recently, the emergence of large-scale pre-trained vision transformer models has enabled prompting approaches as an alternative to data rehearsal. These approaches rely on a key-query mechanism to generate prompts and have been found to be highly resistant to catastrophic forgetting in the well-established rehearsal-free continual learning setting. However, the key mechanism of these methods is not trained end-to-end with the task sequence. Our experiments show that this reduces their plasticity, sacrificing new-task accuracy, and leaves them unable to benefit from expanded parameter capacity. We instead propose to learn a set of prompt components that are assembled with input-conditioned weights to produce input-conditioned prompts, resulting in a novel attention-based end-to-end key-query scheme. Our experiments show that we outperform the current SOTA method DualPrompt on established benchmarks by as much as 4.5% in average final accuracy. We also outperform the state of the art by as much as 4.4% in accuracy on a continual learning benchmark that contains both class-incremental and domain-incremental task shifts, corresponding to many practical settings. Our code is available at https://github.com/GT-RIPL/CODA-Prompt
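The assembly of prompt components with input-conditioned weights can be illustrated with a minimal NumPy sketch. The abstract does not specify how the weights are computed, so the details here are assumptions made for illustration: each component has a key and a feature-attention vector, and its weight is the cosine similarity between the attended query and the key. The function name `assemble_prompt`, the array shapes, and the random initialisation are hypothetical and are not taken from the released code.

```python
import numpy as np


def assemble_prompt(query, components, keys, attention, eps=1e-8):
    """Build one input-conditioned prompt as a weighted sum of components.

    query:      (D,)        feature of the input, e.g. from a frozen encoder
    components: (M, Lp, D)  M prompt components, each Lp tokens of width D
    keys:       (M, D)      one key per component (assumed learnable)
    attention:  (M, D)      one feature-attention vector per component
                            (assumed learnable)
    Returns the prompt of shape (Lp, D) and the weights of shape (M,).
    """
    # Re-weight the query features separately for each component.
    attended = query[None, :] * attention                      # (M, D)
    # Cosine similarity between each attended query and its key.
    dots = np.sum(attended * keys, axis=1)                     # (M,)
    norms = np.linalg.norm(attended, axis=1) * np.linalg.norm(keys, axis=1)
    weights = dots / (norms + eps)                             # (M,)
    # Weighted sum over the component axis gives the final prompt.
    prompt = np.einsum("m,mld->ld", weights, components)       # (Lp, D)
    return prompt, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, Lp, D = 4, 8, 16  # hypothetical sizes
    prompt, weights = assemble_prompt(
        query=rng.normal(size=D),
        components=rng.normal(size=(M, Lp, D)),
        keys=rng.normal(size=(M, D)),
        attention=rng.normal(size=(M, D)),
    )
    print(prompt.shape, weights.shape)
```

In this sketch every step from keys and attention vectors to the prompt is a smooth operation, so in an autodiff framework the task loss could update them directly. That is one way to read "end-to-end" in the abstract, not a statement about the authors' exact implementation.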