State Space Models (SSMs) have emerged as efficient alternatives to Transformers, mitigating their quadratic computational cost. However, the application of Parameter-Efficient Fine-Tuning (PEFT) methods to SSMs remains largely unexplored. In particular, prompt-based methods such as Prompt Tuning and Prefix-Tuning, which are widely used in Transformers, do not perform well on SSMs. To address this, we propose state-based methods as a superior alternative to prompt-based methods. This new family of methods stems naturally from the architectural characteristics of SSMs: rather than relying on external prompts, state-based methods adjust state-related features directly. Furthermore, we introduce a novel state-based PEFT method, State-offset Tuning, which directly affects the state at the current timestep at every step, leading to more effective adaptation. Through extensive experiments across diverse datasets, we demonstrate the effectiveness of our method. Code is available at https://github.com/furiosa-ai/ssm-state-tuning.
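To make the idea of adjusting state-related features concrete, below is a minimal PyTorch sketch of one possible state-based adaptation for a diagonal SSM layer. The class name `StateOffsetSSM`, the parameterization of `A`, `B`, and `C`, and the exact point where the offset is applied are illustrative assumptions, not the authors' implementation; the only intent is to show a frozen SSM whose trainable parameter is a per-channel offset applied to the state at every timestep.

```python
# Illustrative sketch only (assumed names and parameterization), not the method from the repository.
import torch
import torch.nn as nn

class StateOffsetSSM(nn.Module):
    def __init__(self, d_model: int, d_state: int):
        super().__init__()
        # Frozen "pretrained" SSM parameters: diagonal transition A, input map B, output map C.
        self.A = nn.Parameter(-torch.rand(d_model, d_state), requires_grad=False)
        self.B = nn.Parameter(torch.randn(d_model, d_state) * 0.02, requires_grad=False)
        self.C = nn.Parameter(torch.randn(d_model, d_state) * 0.02, requires_grad=False)
        # Trainable state offset: the only parameters updated during fine-tuning.
        self.state_offset = nn.Parameter(torch.zeros(d_model, d_state))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, d_model = x.shape
        h = torch.zeros(batch, d_model, self.A.shape[1], device=x.device)
        A_bar = torch.exp(self.A)  # discretized diagonal transition, values in (0, 1]
        outputs = []
        for t in range(seq_len):
            u = x[:, t]                                    # (batch, d_model)
            h = A_bar * h + self.B * u.unsqueeze(-1)       # standard SSM recurrence
            # State-offset: shift the state features directly at every timestep,
            # instead of prepending external prompt tokens to the input.
            y = ((h + self.state_offset) * self.C).sum(-1)
            outputs.append(y)
        return torch.stack(outputs, dim=1)                 # (batch, seq_len, d_model)
```

In this sketch, fine-tuning would update only `state_offset` while the SSM weights stay frozen, which is what makes the approach parameter-efficient; consult the linked repository for the actual formulation used in the paper.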