State Space Models (SSMs) and Hidden Markov Models (HMMs) are foundational frameworks for modeling sequential data with latent variables and are widely used in signal processing, control theory, and machine learning. Despite their shared temporal structure, they differ fundamentally in the nature of their latent states, their probabilistic assumptions, their inference procedures, and their training paradigms. Recently, deterministic state space models have re-emerged in natural language processing through architectures such as S4 and Mamba, raising new questions about the relationship between classical probabilistic SSMs, HMMs, and modern neural sequence models. In this paper, we present a unified and systematic comparison of HMMs, linear Gaussian state space models, Kalman filtering, and contemporary NLP state space models. We analyze their formulations through the lens of probabilistic graphical models, examine their inference algorithms -- including the forward-backward algorithm and the Kalman filter -- and contrast their learning procedures via Expectation-Maximization and gradient-based optimization. By highlighting both structural similarities and semantic differences, we clarify when these models are equivalent, when they fundamentally diverge, and how modern NLP SSMs relate to classical probabilistic models. Our analysis bridges perspectives from control theory, probabilistic modeling, and modern deep learning.
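To make the comparison concrete, the following is a minimal sketch of the three state-evolution structures named above, written in standard textbook notation rather than the paper's own: an HMM evolves a discrete latent state $z_t \in \{1,\dots,K\}$, a linear Gaussian SSM evolves a continuous stochastic state $h_t \in \mathbb{R}^d$, and a deterministic NLP SSM (as in S4 or Mamba, at the level of a single layer) evolves a continuous noiseless state driven by the input:
\begin{align*}
\text{HMM:} \quad & p(z_t \mid z_{t-1}) = A_{z_{t-1} z_t}, & p(x_t \mid z_t) &= B_{z_t}(x_t),\\
\text{Linear Gaussian SSM:} \quad & h_t = A h_{t-1} + w_t,\ w_t \sim \mathcal{N}(0, Q), & x_t &= C h_t + v_t,\ v_t \sim \mathcal{N}(0, R),\\
\text{Deterministic NLP SSM:} \quad & h_t = \bar{A}\, h_{t-1} + \bar{B}\, x_t, & y_t &= C h_t.
\end{align*}
The first two are probabilistic graphical models whose posteriors are computed by the forward-backward algorithm and the Kalman filter, respectively, and whose parameters are typically fit by Expectation-Maximization; the third is a deterministic recurrence trained end-to-end by gradient-based optimization.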