In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have exponentially decaying memory. Our analysis identifies this ``curse of memory'' as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lift their memory limitations. Besides improving approximation capabilities, we further illustrate that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings on synthetic datasets, language models, and image classification.
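To make the stability-boundary argument concrete, the following is a minimal sketch for a scalar (diagonal) discrete-time recurrence h[k+1] = lam * h[k] + x[k]. It compares a direct parameterization, where the trainable parameter is the recurrent weight itself, with an exponential reparameterization lam = exp(-exp(w)). The exponential map is used here only as one familiar stable choice and is an assumption of this sketch; it is not necessarily the scheme proposed in the paper. The function names and the perturbation size are likewise hypothetical.

```python
import math


def direct(w: float) -> float:
    """Direct parameterization: the recurrent weight is the trainable parameter."""
    return w


def exp_reparam(w: float) -> float:
    """Exponential reparameterization (assumed for illustration).

    Mathematically, exp(-exp(w)) lies in (0, 1) for every real w,
    so the recurrence stays stable (up to floating-point limits).
    """
    return math.exp(-math.exp(w))


def memory_kernel(lam: float, steps: int) -> list[float]:
    """Impulse response of h[k+1] = lam * h[k] + x[k], i.e. rho[k] = lam**k."""
    return [lam ** k for k in range(steps)]


# A long-memory target needs lam close to the stability boundary |lam| = 1.
target_lam = 0.999
w_direct = target_lam
w_reparam = math.log(-math.log(target_lam))  # gives the same lam after the map

# Hypothetical size of one noisy parameter update during training.
perturbation = 0.002

lam_direct = direct(w_direct + perturbation)          # crosses the boundary
lam_reparam = exp_reparam(w_reparam + perturbation)   # stays inside (0, 1)

print("direct:        lam =", lam_direct)
print("reparametrized: lam =", lam_reparam)
print("direct kernel tail:        ", memory_kernel(lam_direct, 5000)[-1])
print("reparametrized kernel tail:", memory_kernel(lam_reparam, 5000)[-1])
```

In this sketch the same small update pushes the directly parameterized weight past 1, so its memory kernel grows instead of decaying, while the reparameterized weight remains strictly below 1. This is only meant to illustrate why weights near the stability boundary are fragile under direct parameterization.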