Enhanced Few-Shot Class-Incremental Learning via Ensemble Models

Few-shot class-incremental learning (FSCIL) aims to continually learn new classes from limited training data while maintaining performance on previously learned classes. The main challenges are overfitting to the scarce new training samples and forgetting old classes. While catastrophic forgetting has been extensively studied, the overfitting problem has attracted less attention in FSCIL. To tackle the overfitting challenge, we design a new ensemble model framework combined with data augmentation to boost generalization. In this way, the enhanced model works as a library storing abundant features to guarantee fast adaptation to downstream tasks. Specifically, a multi-input multi-output ensemble structure is applied together with a spatial-aware data augmentation strategy, aiming to diversify the feature extractor and alleviate overfitting in incremental sessions. Moreover, self-supervised learning is integrated to further improve model generalization. Comprehensive experimental results show that the proposed method can indeed mitigate the overfitting problem in FSCIL and outperforms state-of-the-art methods.
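
The abstract names a multi-input multi-output ensemble but gives no implementation details. The following is a minimal sketch of that general idea only, not the authors' model: several inputs are concatenated and passed through one shared backbone, each ensemble member has its own output head, and at test time the same input fills every slot and the heads are averaged. All sizes (`M`, `D`, `H`, `C`), the random linear weights, and the function names `mimo_forward` and `ensemble_predict` are assumptions made for illustration. The spatial-aware augmentation and the self-supervised objective are left out because the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
M = 2   # number of ensemble members (input/output slots)
D = 8   # flattened input size per member
H = 16  # width of the shared hidden layer
C = 5   # number of classes

# One shared backbone sees all M inputs concatenated;
# each member owns a separate classification head.
W_backbone = rng.normal(scale=0.1, size=(M * D, H))
W_heads = rng.normal(scale=0.1, size=(M, H, C))


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mimo_forward(inputs):
    """Map inputs of shape (M, batch, D) to probabilities of shape (M, batch, C)."""
    batch = inputs.shape[1]
    # Concatenate the M member inputs of each example into one vector.
    joint = inputs.transpose(1, 0, 2).reshape(batch, M * D)
    # Shared features with a ReLU non-linearity.
    hidden = np.maximum(joint @ W_backbone, 0.0)
    # Apply one head per member to the shared features.
    logits = np.einsum("bh,mhc->mbc", hidden, W_heads)
    return softmax(logits)


def ensemble_predict(x):
    """Test time: repeat the same batch in every slot and average the heads."""
    probs = mimo_forward(np.stack([x] * M))
    return probs.mean(axis=0)


x = rng.normal(size=(4, D))      # a batch of 4 toy examples
avg_probs = ensemble_predict(x)  # shape (4, C), each row sums to 1
```

During training, each slot would receive a different example with its own label, so the members are pushed toward different features inside a single network; the averaging step above is what turns those members into an ensemble prediction.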

