Hidden Markov models (HMMs) are powerful tools for analysing time series data that are driven by discrete, unobserved underlying states. As such, they have gained prominence across numerous empirical disciplines, in particular ecology, medicine, and economics. However, increasingly complex empirical data often exhibit additional latent structure such as spatial effects, temporal trends, or measurement perturbations. Gaussian fields provide an attractive building block for incorporating such structured latent variation into HMMs. Fast inference methods for Gaussian fields have emerged through the stochastic partial differential equation (SPDE) approach. Because of the resulting sparse representation, these methods integrate well with recent frequentist estimation methods for random-effects models based on automatic differentiation and the Laplace approximation. Scaling to high dimensions requires tools such as (R)TMB to exploit sparsity in the Hessian with respect to the latent variables, a property satisfied by SPDE fields but violated by the HMM likelihood. We present a modified forward algorithm for computing the HMM likelihood that yields a sparse Hessian and consequently enables fast and scalable inference. We demonstrate the practical feasibility and usefulness of the approach through simulations and two case studies: the detection of stellar flares and the modelling of lion movement.
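For orientation, the following is a minimal sketch of the standard scaled forward algorithm for evaluating an HMM log-likelihood. It is not the modified algorithm proposed here, only the textbook recursion that the modification starts from. The function name `hmm_log_likelihood` and the argument names `delta`, `Gamma`, and `probs` are illustrative assumptions, and the state-dependent densities are assumed to be precomputed.

```python
import math


def hmm_log_likelihood(delta, Gamma, probs):
    """Standard scaled forward algorithm (illustrative sketch, hypothetical names).

    delta: initial state distribution, length N
    Gamma: N x N transition probability matrix (rows sum to 1)
    probs: T x N table with probs[t][j] = density of observation t given state j
    Returns the log-likelihood of the observation sequence.
    """
    n = len(delta)

    # Forward variables for the first observation.
    alpha = [delta[j] * probs[0][j] for j in range(n)]
    total = sum(alpha)
    loglik = math.log(total)
    alpha = [a / total for a in alpha]  # rescale to avoid numerical underflow

    for t in range(1, len(probs)):
        # Propagate through the transition matrix, then weight by the densities.
        alpha = [
            sum(alpha[i] * Gamma[i][j] for i in range(n)) * probs[t][j]
            for j in range(n)
        ]
        total = sum(alpha)
        loglik += math.log(total)  # accumulate the log of each scaling factor
        alpha = [a / total for a in alpha]

    return loglik
```

The sum of the logged scaling factors equals the log of the full likelihood, which is why the recursion can be rescaled at every step without changing the result.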