We develop a predictive-first optimisation framework for streaming hidden Markov models (HMMs). Unlike classical approaches that prioritise full posterior recovery under a fully specified generative model, we assume access to regime-specific predictive models whose parameters are learned online, while the transition prior over regimes is held fixed. Our objective is to identify latent regimes sequentially while maintaining accurate one-step-ahead predictive distributions. Because the number of possible regime paths grows exponentially with the length of the stream, exact filtering is infeasible. We therefore formulate streaming inference as a constrained projection problem in predictive-distribution space: under a fixed hypothesis budget, we approximate the full posterior predictive by the forward-KL optimal mixture supported on $S$ paths. The solution is the renormalised top-$S$ posterior-weighted mixture, which provides a principled derivation of beam search for HMMs. The resulting algorithm is fully recursive and deterministic: it performs beam-style truncation with closed-form predictive updates and requires neither EM nor sampling. Empirical comparisons against Online EM and Sequential Monte Carlo under matched computational budgets demonstrate competitive prequential performance.
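To make the truncation step concrete, the following is a minimal sketch of a renormalised top-$S$ update, written under simplifying assumptions that are not taken from the paper: each regime has a fixed Gaussian predictive model (the framework described above learns these parameters online), weights are kept in linear rather than log space, and all names (`truncate_top_s`, `beam_step`, `predictive_density`, `trans`, `means`, `stds`) are hypothetical. It illustrates the general idea of beam-style truncation and should not be read as the authors' implementation.

```python
import math
from typing import Dict, List, Tuple

Path = Tuple[int, ...]  # a regime path, e.g. (0, 0, 1)


def truncate_top_s(weights: Dict[Path, float], s: int) -> Dict[Path, float]:
    """Keep the s highest-weight paths and renormalise their weights to sum to 1."""
    kept = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:s]
    total = sum(w for _, w in kept)  # assumed > 0; use log-weights for long streams
    return {path: w / total for path, w in kept}


def gaussian_pdf(y: float, mean: float, std: float) -> float:
    """Density of N(mean, std^2) at y (stand-in for a regime's predictive model)."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def beam_step(beam: Dict[Path, float], y: float, trans: List[List[float]],
              means: List[float], stds: List[float], s: int) -> Dict[Path, float]:
    """Extend every path by every regime, reweight, then truncate to s paths.

    New weight = old weight * fixed transition prior * predictive likelihood of y.
    """
    expanded: Dict[Path, float] = {}
    for path, w in beam.items():
        prev = path[-1]
        for k in range(len(means)):
            lik = gaussian_pdf(y, means[k], stds[k])
            expanded[path + (k,)] = w * trans[prev][k] * lik
    return truncate_top_s(expanded, s)


def predictive_density(beam: Dict[Path, float], y: float, trans: List[List[float]],
                       means: List[float], stds: List[float]) -> float:
    """One-step-ahead predictive density at y as a mixture over the retained paths."""
    return sum(
        w * trans[path[-1]][k] * gaussian_pdf(y, means[k], stds[k])
        for path, w in beam.items()
        for k in range(len(means))
    )


# Illustrative usage with two regimes and a budget of S = 3 paths.
trans = [[0.9, 0.1], [0.1, 0.9]]        # fixed transition prior over regimes
means, stds = [0.0, 5.0], [1.0, 1.0]    # assumed fixed regime parameters
beam = {(0,): 0.5, (1,): 0.5}           # uniform initial regime weights
for y in [0.1, -0.2, 5.1, 4.9]:
    beam = beam_step(beam, y, trans, means, stds, s=3)
best_path = max(beam, key=beam.get)
```

Each step costs on the order of $S \times K$ likelihood evaluations for $K$ regimes, so the hypothesis budget $S$ directly controls the per-observation cost. A practical version would store log-weights to avoid underflow on long streams.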