Few-shot class-incremental learning (FSCIL) aims to incrementally learn models from small amounts of novel data, which requires representations learned under few-example supervision to be both strong and adaptable so as to avoid catastrophic forgetting of old classes and overfitting to novel classes. This work proposes a generative co-memory regularization approach to facilitate FSCIL. In this approach, base learning applies generative domain-adaptation finetuning: a pretrained generative encoder is finetuned on a few examples of base classes by jointly optimizing a masked autoencoder (MAE) decoder for feature reconstruction and a fully connected classifier for feature classification, enabling the model to efficiently capture general and adaptable representations. Using the finetuned encoder and the learned classifier, we construct two class-wise memories: a representation memory that stores the mean feature of each class, and a weight memory that stores the classifier weights. Memory-regularized incremental learning then trains the classifier dynamically on the examples of few-shot classes in each incremental session by simultaneously optimizing feature classification and co-memory regularization. The memories are updated in a class-incremental manner and collaboratively regularize the incremental learning. In this way, the learned models improve recognition accuracy while mitigating catastrophic forgetting of old classes and overfitting to novel classes. Extensive experiments on popular benchmarks clearly demonstrate that our approach outperforms state-of-the-art methods.
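To make the co-memory regularization concrete, the following is a minimal PyTorch sketch of one plausible instantiation of the incremental-session objective. It is an illustrative assumption, not the paper's exact formulation: the function name `co_memory_regularized_loss`, the choice of cross-entropy on stored class-mean features for the representation-memory term, the mean-squared penalty on old-class classifier rows for the weight-memory term, and the trade-off weights `lam_rep` and `lam_w` are all hypothetical.

```python
import torch
import torch.nn.functional as F


def co_memory_regularized_loss(feats, labels, classifier,
                               rep_memory, weight_memory,
                               lam_rep=1.0, lam_w=1.0):
    """One plausible co-memory-regularized objective (hypothetical sketch).

    feats:         (B, d) encoder features of the current few-shot batch
    labels:        (B,)   class indices for the batch
    classifier:    nn.Linear(d, C_seen) -- classifier over all seen classes
    rep_memory:    (C_old, d) stored class-mean features of old classes
    weight_memory: (C_old, d) stored classifier weights of old classes
    """
    # Feature-classification loss on the novel few-shot examples.
    cls_loss = F.cross_entropy(classifier(feats), labels)

    # Representation-memory term: stored old-class mean features should
    # still be classified to their own classes by the updated classifier.
    old_labels = torch.arange(rep_memory.size(0))
    rep_loss = F.cross_entropy(classifier(rep_memory), old_labels)

    # Weight-memory term: old-class rows of the classifier should stay
    # close to their stored weights, limiting forgetting of old classes.
    w_old = classifier.weight[: weight_memory.size(0)]
    w_loss = F.mse_loss(w_old, weight_memory)

    # The two memory terms collaboratively regularize the classification loss.
    return cls_loss + lam_rep * rep_loss + lam_w * w_loss
```

After each session, the two memories would be extended class-incrementally: the mean feature and the learned classifier row of each newly added class are appended to the representation memory and weight memory, respectively.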