State-Space Models (SSMs) excel at capturing long-range dependencies through structured recurrence, making them well suited to sequence modeling. However, their evolving internal states make them difficult to adapt under Continual Learning (CL). The difficulty is most acute in exemplar-free settings, where the absence of prior data leaves updates to the dynamic SSM states unconstrained, resulting in catastrophic forgetting. To address this, we propose Inf-SSM, a novel and simple geometry-aware regularization method that uses the geometry of the infinite-dimensional Grassmannian to constrain state evolution during CL. Unlike classical CL methods that constrain weight updates, Inf-SSM regularizes the infinite-horizon evolution of SSMs encoded in their extended observability subspace. We show that enforcing this regularization requires solving a matrix equation known as the Sylvester equation, which typically incurs $\mathcal{O}(n^3)$ complexity. We develop an $\mathcal{O}(n^2)$ solution by exploiting the structure and properties of SSMs. This leads to an efficient regularization mechanism that can be seamlessly integrated into existing CL methods. Comprehensive experiments on challenging benchmarks, including ImageNet-R and Caltech-256, demonstrate a significant reduction in forgetting while improving accuracy across sequential tasks.
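To make the complexity claim concrete, the following is a minimal sketch of why structure can reduce the cost, not the paper's actual derivation. It assumes discrete-time SSMs with diagonal, stable state matrices $A_i = \mathrm{diag}(a_i)$, $|a_i| < 1$ (an assumption made here for illustration; the abstract does not state which structure is exploited). Under that assumption, the Gram matrix $G = O_1^\top O_2$ of two extended observability matrices $O_i = [C_i;\, C_i A_i;\, C_i A_i^2;\, \dots]$ satisfies the discrete-time Sylvester (Stein) equation $G = A_1^\top G A_2 + C_1^\top C_2$, which has an elementwise closed form. The function names `observability_gram_diag` and `truncated_observability` are hypothetical.

```python
import numpy as np

def observability_gram_diag(a1, C1, a2, C2):
    """Gram matrix G = O1^T O2 for diagonal SSMs (assumed structure).

    With A1 = diag(a1) and A2 = diag(a2), the equation
    G = A1^T G A2 + C1^T C2 decouples per entry:
    G[i, j] = (C1^T C2)[i, j] / (1 - a1[i] * a2[j]).
    Given Q = C1^T C2, this is O(n^2) work instead of a dense O(n^3) solve.
    """
    Q = C1.T @ C2
    return Q / (1.0 - np.outer(a1, a2))

def truncated_observability(a, C, horizon):
    """Stack C A^k for k = 0..horizon-1, with A = diag(a) (reference only)."""
    return np.vstack([C * a**k for k in range(horizon)])

# Illustrative sizes and random parameters, not taken from the paper.
rng = np.random.default_rng(0)
n, p = 4, 3
a1 = rng.uniform(-0.9, 0.9, n)
a2 = rng.uniform(-0.9, 0.9, n)
C1 = rng.standard_normal((p, n))
C2 = rng.standard_normal((p, n))

G = observability_gram_diag(a1, C1, a2, C2)

# Brute-force reference: truncate the infinite horizon at 500 steps.
O1 = truncated_observability(a1, C1, 500)
O2 = truncated_observability(a2, C2, 500)
G_ref = O1.T @ O2

# Residual of the Stein equation G = A1^T G A2 + C1^T C2.
stein_residual = G - (a1[:, None] * G * a2[None, :] + C1.T @ C2)

print(np.allclose(G, G_ref), np.abs(stein_residual).max())
```

Gram matrices of this kind are the usual building blocks for distances between subspaces, so a closed form for $G$ avoids ever materializing the infinite-horizon observability matrix. For non-diagonal state matrices a general solver such as `scipy.linalg.solve_sylvester` (continuous-time form $AX + XB = Q$) would be the dense $\mathcal{O}(n^3)$ baseline.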