Many state-of-the-art models for long-range sequence tasks, such as S4, S5, and LRU, are built from sequential blocks that combine State-Space Models (SSMs) with neural networks. In this paper we provide a PAC bound for this kind of architecture with stable SSM blocks that does not depend on the length of the input sequence. Imposing stability on the SSM blocks is standard practice in the literature and is known to improve performance. Our results give a theoretical justification for the use of stable SSM blocks: the proposed PAC bound decreases as the degree of stability of the SSM blocks increases.
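To make the stability notion concrete, the following is a minimal illustrative sketch (not code from the paper) of a diagonal linear SSM block in the style of S5 or LRU: a recurrence $x_{t+1} = A x_t + B u_t$, $y_t = C x_t$ with diagonal $A$. Stability here means every eigenvalue of $A$ lies strictly inside the unit disk, i.e. $|a_i| < 1$ for each diagonal entry; the function and variable names below are hypothetical.

```python
import numpy as np

def run_ssm(a_diag, B, C, u):
    """Run a diagonal linear SSM over an input sequence u of shape (T, in_dim)."""
    state = np.zeros(len(a_diag))
    outputs = []
    for u_t in u:
        state = a_diag * state + B @ u_t  # elementwise recurrence: diagonal A
        outputs.append(C @ state)
    return np.array(outputs)

rng = np.random.default_rng(0)
a_diag = 0.9 * rng.uniform(-1.0, 1.0, size=4)  # |a_i| <= 0.9 < 1: stable block
B = rng.normal(size=(4, 2))
C = rng.normal(size=(3, 4))

# Impulse input: the response of a stable SSM decays geometrically with t,
# which is the kind of forgetting that a length-independent bound can exploit.
u = np.zeros((50, 2))
u[0] = 1.0
y = run_ssm(a_diag, B, C, u)
print(np.linalg.norm(y[-1]) < np.linalg.norm(y[1]))
```

Since $\max_i |a_i| \le 0.9$, the state after the impulse shrinks by at least that factor at every step, so the late outputs are much smaller than the early ones regardless of sequence length.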