Controller-Augmented Hidden Markov Models: A Computational Framework for Constrained Sequential Inference

Hidden Markov models are foundational for sequential inference, but their Markovian assumption fails under pathwise constraints such as precedence requirements, visitation cardinalities, or monotonic state progression. Such constraints induce long-range dependencies that invalidate standard dynamic programming algorithms. To address this, we present Controller-Augmented Hidden Markov Models (CHMMs), a framework that compiles each constraint into a finite-state controller tracking the minimal sufficient history. Standard forward--backward and Viterbi recursions on the augmented chain then compute exact constrained posteriors and maximum a posteriori paths in both discrete and continuous time, the latter through uniformization. We establish four theoretical guarantees: exactness of constrained inference, monotone ascent of constrained EM, inference complexity linear in the controller cardinality, and a total-variation bound under constraint misspecification. A catalog of controller encodings covering 11 constraint families across the ordering, visitation, path, and temporal categories operationalizes the framework. Empirically, we evaluate CHMMs against six alternative decoders on three real-world sequence-labeling tasks of substantively different character: gene-structure decoding in \emph{Drosophila melanogaster}, free-living activity recognition in CASAS smart-home environments, and protocol-defined human activity recognition from wearable sensors. The results reveal a clean local-versus-cumulative dichotomy: in cumulative-constraint regimes, controller augmentation is the only approach that recovers globally feasible trajectories, whereas in locally dominated regimes simpler decoders achieve comparable validity. Together, theory and experiment characterize when exact controller augmentation is necessary and when simpler approaches suffice.
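The controller-augmentation idea can be illustrated with a minimal sketch for the discrete-time case. It is not the paper's implementation: the function name `constrained_viterbi`, the `ctrl_step` interface, and the toy two-state model are assumptions made for illustration only. The sketch runs an ordinary Viterbi recursion over pairs (hidden state, controller state), here with a two-state controller that encodes the visitation constraint "state 1 must be visited at least once".

```python
import math

NEG_INF = float("-inf")

def constrained_viterbi(log_pi, log_A, log_B, obs, ctrl_init, ctrl_step, accepting):
    """Viterbi over the augmented chain of (hidden state, controller state) pairs.

    ctrl_step(c, s) returns the controller state after the chain enters hidden
    state s from controller state c, or None if that move is forbidden.
    Hypothetical interface, assumed for this sketch. obs must be non-empty.
    """
    K = len(log_pi)
    # delta[(s, c)]: best log-score of any prefix ending in the pair (s, c)
    delta = {}
    for s in range(K):
        c = ctrl_step(ctrl_init, s)
        if c is not None:
            delta[(s, c)] = log_pi[s] + log_B[s][obs[0]]
    back = []  # back[t-1][(s2, c2)] = best predecessor pair at time t-1
    for t in range(1, len(obs)):
        new_delta, ptr = {}, {}
        for (s, c), score in delta.items():
            for s2 in range(K):
                c2 = ctrl_step(c, s2)
                if c2 is None:
                    continue  # transition violates the constraint
                cand = score + log_A[s][s2] + log_B[s2][obs[t]]
                if cand > new_delta.get((s2, c2), NEG_INF):
                    new_delta[(s2, c2)] = cand
                    ptr[(s2, c2)] = (s, c)
        delta = new_delta
        back.append(ptr)
    # Only paths whose controller ends in an accepting state are feasible.
    finals = {k: v for k, v in delta.items() if k[1] in accepting}
    if not finals:
        return None, NEG_INF
    end = max(finals, key=finals.get)
    pairs = [end]
    for ptr in reversed(back):
        pairs.append(ptr[pairs[-1]])
    pairs.reverse()
    return [s for s, _ in pairs], finals[end]

def logs(rows):
    """Elementwise log of a list of probability rows (all entries assumed > 0)."""
    return [[math.log(p) for p in row] for row in rows]

# Toy model (illustrative values): the observations strongly favor state 0.
log_pi = [math.log(0.9), math.log(0.1)]
log_A = logs([[0.9, 0.1], [0.1, 0.9]])
log_B = logs([[0.8, 0.2], [0.2, 0.8]])
obs = [0, 0, 0, 0]

# Trivial one-state controller: recovers unconstrained Viterbi.
free_path, free_score = constrained_viterbi(
    log_pi, log_A, log_B, obs, 0, lambda c, s: 0, {0})

# "Visit state 1 at least once": controller bit flips to 1 on the first visit.
visit_path, visit_score = constrained_viterbi(
    log_pi, log_A, log_B, obs, 0,
    lambda c, s: 1 if (c == 1 or s == 1) else 0, {1})

print(free_path)   # [0, 0, 0, 0]
print(visit_path)  # [0, 0, 0, 1]
```

In this sketch the work per time step grows with the number of reachable (hidden state, controller state) pairs, which is consistent with the linear dependence on controller cardinality stated above. A locally greedy repair of the unconstrained path would not in general find the best feasible path; the augmented recursion does so by construction.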
