Continual learning (CL) trains a machine learning model on a sequence of tasks so that it acquires new knowledge while retaining previously learned tasks, without access to the training data of earlier tasks. Despite significant interest in CL, most recent approaches in computer vision have focused solely on convolutional architectures. Given the recent success of vision transformers, their potential for CL deserves exploration. The few existing CL approaches for vision transformers either store training instances from previous tasks or require a task identifier at test time, both of which can be limiting. This paper proposes ConTraCon, a new exemplar-free approach for class/task incremental learning that neither requires an explicit task identifier during inference nor stores previous training instances. The approach leverages the transformer architecture by re-weighting the key, query, and value weights of the multi-head self-attention layers of a transformer trained on a similar task. The re-weighting is performed by convolution, which keeps the number of additional parameters per task low. In addition, an image-augmentation-based entropic task identification scheme predicts the task at inference time, so no task identifier needs to be supplied. Experiments on four benchmark datasets show that the proposed approach outperforms several competitive approaches while requiring fewer parameters.
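The two mechanisms named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`conv_reweight`, `identify_task`), the square odd-sized kernel with zero padding, the averaging of predictions over augmented views, and the rule of picking the task with the lowest entropy are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math


def conv_reweight(weights, kernel):
    """Re-weight a weight matrix with a small 2D kernel.

    Assumption: 'same'-size cross-correlation with zero padding and a
    square, odd-sized kernel. Only the kernel would be learned per task.
    """
    rows, cols = len(weights), len(weights[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    r, c = i + a - pad, j + b - pad
                    # Positions outside the matrix count as zero padding.
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += kernel[a][b] * weights[r][c]
            out[i][j] = acc
    return out


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def identify_task(logits_per_task):
    """Pick a task without a task identifier.

    logits_per_task[t] holds one logit vector per augmented view of the
    same image, produced by the task-t model (hypothetical input layout).
    Assumption: the task whose view-averaged prediction has the lowest
    entropy is selected.
    """
    scores = []
    for views in logits_per_task:
        probs = [softmax(v) for v in views]
        n_classes = len(probs[0])
        mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
        scores.append(entropy(mean))
    return min(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy 2x2 stand-in for a query/key/value weight matrix.
    w_old = [[1.0, 2.0], [3.0, 4.0]]
    kernel = [[0.0, 0.1, 0.0], [0.1, 1.0, 0.1], [0.0, 0.1, 0.0]]
    print(conv_reweight(w_old, kernel))

    # Task 0 is confident on both views, task 1 is close to uniform.
    print(identify_task([
        [[5.0, 0.0, 0.0], [4.0, 0.5, 0.0]],
        [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0]],
    ]))
```

To see why convolutional re-weighting is parameter-efficient, take illustrative sizes that are not from the paper: a fresh copy of a 768 × 768 weight matrix would add 589,824 parameters per task, whereas a 3 × 3 kernel adds 9.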