Few-shot class-incremental learning (FSCIL) is a paradigm where a model, initially trained on a dataset of base classes, must adapt to an expanding problem space by recognizing novel classes with limited data. We focus on the challenging FSCIL setup where a model receives only a single sample (1-shot) for each novel class and no further training or model alterations are allowed after the base training phase. This makes generalization to novel classes particularly difficult. We propose a novel approach predicated on the hypothesis that base and novel class embeddings have structural similarity. We map the original embedding space into a residual space by subtracting the class prototype (i.e., the average class embedding) of input samples. Then, we leverage generative modeling with VAE or diffusion models to learn the multi-modal distribution of residuals over the base classes, and we use this as a valuable structural prior to improve recognition of novel classes. Our approach, Gen1S, consistently improves novel class recognition over the state of the art across multiple benchmarks and backbone architectures.
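The residual-space mapping described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the embedding array, its shape, and all variable names are assumptions for demonstration, and the generative model (VAE or diffusion) that would be fit on the residuals is omitted.

```python
import numpy as np

# Hypothetical base-class embeddings: 3 classes, 4 samples each, dim 8.
rng = np.random.default_rng(0)
num_classes, per_class, dim = 3, 4, 8
embeddings = rng.normal(size=(num_classes, per_class, dim))

# Class prototype: the average embedding of each class.
prototypes = embeddings.mean(axis=1, keepdims=True)  # shape (3, 1, 8)

# Residual space: subtract each sample's class prototype.
residuals = embeddings - prototypes  # shape (3, 4, 8)

# By construction, the residuals of each class are centered at the origin;
# a generative model trained on them captures within-class structure
# shared across classes, which the paper uses as a prior for novel classes.
print(np.allclose(residuals.mean(axis=1), 0.0))  # True
```

The zero per-class mean is what makes the residual distribution class-agnostic: pooling residuals from all base classes yields a single multi-modal distribution that can be reused around a novel class's 1-shot prototype.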