We investigate the expressive power of state space models (SSMs), which have recently emerged as a potential alternative to transformer architectures in large language models. Building on recent work, we analyse SSM expressiveness through fragments and extensions of linear temporal logic over finite traces. Our results show that the expressive capabilities of SSMs vary substantially depending on the underlying gating mechanism. We further distinguish between SSMs operating over fixed-width arithmetic (quantised models), whose expressive power remains within the regular languages, and SSMs with unbounded precision, which can capture counting properties and non-regular languages. In addition, we provide a systematic comparison between these SSM variants and known results on transformers, thereby clarifying how the two architectures relate in terms of expressive power.