Infinite hidden Markov models provide a flexible framework for modeling time series with structural changes and complex dynamics, without requiring the number of latent states to be specified in advance. This flexibility comes from the hierarchical Dirichlet process prior, while efficient Bayesian inference is enabled by the beam sampler, which combines dynamic programming with slice sampling to adaptively truncate the infinite state space. Despite extensive methodological developments, the role of initialization in this framework has received limited attention. This gap is addressed by systematically evaluating initialization strategies commonly used for finite hidden Markov models and assessing their suitability in the infinite setting. Results on both simulated and real datasets show that distance-based clustering initializations consistently outperform model-based and uniform alternatives, even though uniform initialization is the most widely adopted choice in the existing literature.
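To make the contrast between uniform and distance-based initialization concrete, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes a one-dimensional series, takes "uniform" to mean state labels drawn uniformly at random, and uses a plain k-means (Lloyd's algorithm) as one example of a distance-based clustering. The function names `uniform_init` and `kmeans_init`, the starting number of states, and the toy data are all hypothetical.

```python
import random


def uniform_init(n_obs, n_states, seed=0):
    """Assign each time step a state label drawn uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(n_states) for _ in range(n_obs)]


def kmeans_init(obs, n_states, n_iter=50):
    """Assign each time step to its nearest centroid (1-D Lloyd's algorithm)."""
    # Deterministic start: place centroids at evenly spaced quantiles of the data.
    ordered = sorted(obs)
    centroids = [
        ordered[(2 * k + 1) * len(obs) // (2 * n_states)] for k in range(n_states)
    ]
    labels = [0] * len(obs)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by absolute distance.
        new_labels = [
            min(range(n_states), key=lambda k: abs(y - centroids[k])) for y in obs
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(n_states):
            members = [y for y, z in zip(obs, new_labels) if z == k]
            if members:
                centroids[k] = sum(members) / len(members)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Toy series with one structural change: mean 0 for 50 steps, then mean 5.
rng = random.Random(1)
series = [rng.gauss(0.0, 0.5) for _ in range(50)] + [
    rng.gauss(5.0, 0.5) for _ in range(50)
]

z_uniform = uniform_init(len(series), n_states=2)    # ignores the data
z_kmeans, centers = kmeans_init(series, n_states=2)  # follows the two regimes
```

In this sketch the number of clusters only fixes the starting state sequence handed to a sampler. In the infinite setting the sampler would remain free to add or remove states afterwards, so the value 2 here is an assumed starting point and not a model constraint.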