Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning

Few-shot class-incremental learning (FSCIL) aims to integrate new classes into a model from only a few training samples while preserving knowledge of previously learned classes. Traditional methods widely adopt static adaptation, relying on a fixed parameter space to learn from sequentially arriving data, which makes them prone to overfitting to the current session. Existing dynamic strategies must continually expand the parameter space, which increases complexity. In this study, we explore the potential of selective state space models (SSMs) for FSCIL, leveraging their dynamic weights and strong sequence-modeling ability to address these challenges. Concretely, we propose a dual selective SSM projector that dynamically adjusts its projection parameters based on the intermediate features. The dual design enables the model to maintain the robust features of base classes while adaptively learning distinctive feature shifts for novel classes. Additionally, we develop a class-sensitive selective scan mechanism to guide dynamic adaptation. It minimizes the disruption to base-class representations caused by training on novel data and, at the same time, forces the selective scan to follow distinct patterns for base and novel classes. Experiments on miniImageNet, CUB-200, and CIFAR-100 demonstrate that our framework outperforms existing state-of-the-art methods. The code is available at https://github.com/xiaojieli0903/Mamba-FSCIL.
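To make the idea of input-dependent ("selective") projection parameters concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a scalar input sequence, a diagonal state matrix, and a simplified discretization; the names `selective_scan`, `dual_projector`, and the parameter keys (`A`, `w_B`, `w_C`, `w_dt`, `b_dt`) are hypothetical. The dual design is modeled here simply as the sum of a base branch and a novel branch, which is an assumption about how the two branches combine.

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, p):
    """Run a scalar-input selective SSM over the sequence xs.

    p holds hypothetical parameters (illustrative names only):
      A          : list of N negative decay rates (diagonal state matrix)
      w_B, w_C   : lists of N weights that produce input-dependent B_t, C_t
      w_dt, b_dt : scalars that produce the input-dependent step size
    """
    n = len(p["A"])
    h = [0.0] * n  # hidden state, reset for every sequence
    ys = []
    for x in xs:
        # The step size depends on the current input: the "selective" part.
        dt = softplus(p["w_dt"] * x + p["b_dt"])
        for i in range(n):
            a_bar = math.exp(dt * p["A"][i])  # discretized decay
            b_bar = dt * p["w_B"][i] * x      # discretized, input-dependent B_t
            h[i] = a_bar * h[i] + b_bar * x
        # The readout weights C_t are input-dependent as well.
        ys.append(sum(p["w_C"][i] * x * h[i] for i in range(n)))
    return ys


def dual_projector(xs, base, novel):
    # Assumed combination rule: base-branch output plus a novel-branch shift.
    base_out = selective_scan(xs, base)
    novel_out = selective_scan(xs, novel)
    return [b + s for b, s in zip(base_out, novel_out)]
```

In this sketch, a novel branch whose readout weights `w_C` are all zero contributes nothing, so the projector reproduces the base branch exactly. That mirrors the stated goal of leaving base-class features undisturbed while the novel branch learns only a feature shift.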

