Multistate Markov models are a canonical parametric approach for modeling observed or latent stochastic processes supported on a finite state space. Continuous-time Markov processes describe data that are observed irregularly over time, as is often the case in longitudinal medical data. When a continuous-time Markov process is assumed to be time-homogeneous, a closed-form likelihood function can be derived from the Kolmogorov forward equations, a system of differential equations with a well-known matrix-exponential solution. The forward equations do not, however, admit an analytical solution for continuous-time, time-inhomogeneous Markov processes, so researchers and practitioners often make the simplifying assumption that the process is piecewise time-homogeneous. In this paper, we provide intuition for, and illustrations of, the biases in parameter estimation that may arise in the more realistic scenario in which the piecewise-homogeneous assumption is violated, and we advocate for a solution that computes the likelihood in a truly time-inhomogeneous fashion. Particular focus is given to multistate Markov models that allow for state label misclassifications, a setting that applies more broadly to hidden Markov models (HMMs). In this setting, Bayesian computations bypass the computationally demanding numerical gradient approximations that obtaining maximum likelihood estimates (MLEs) would otherwise require. Supplemental materials are available online.
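The contrast between the matrix-exponential solution and a truly time-inhomogeneous computation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-state process, the rate function `generator` with its parameters `a`, `b`, and `c`, the choice to freeze the generator at the left endpoint of each piece, and the use of a fixed-step fourth-order Runge-Kutta scheme to integrate the forward equations dP(s,t)/dt = P(s,t) Q(t). The paper's own numerical solution may differ.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(A, c):
    return [[c * x for x in row] for row in A]

def add(A, B, c=1.0):
    """Return A + c * B elementwise."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def expm(A, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    norm = max(sum(abs(x) for x in row) for row in A)
    squarings = 0
    while norm / 2 ** squarings > 0.5:
        squarings += 1
    scaled = scale(A, 1.0 / 2 ** squarings)
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms + 1):
        term = scale(mat_mul(term, scaled), 1.0 / k)
        result = add(result, term)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def generator(t, a=0.1, b=0.3, c=0.05):
    """Hypothetical time-varying generator: the 1->2 rate grows as a*exp(b*t)."""
    q12 = a * math.exp(b * t)
    return [[-q12, q12], [c, -c]]

def transition_piecewise(gen, s, t, pieces):
    """Piecewise-homogeneous approximation: freeze Q at the left endpoint of each piece."""
    h = (t - s) / pieces
    P = identity(len(gen(s)))
    for k in range(pieces):
        P = mat_mul(P, expm(scale(gen(s + k * h), h)))
    return P

def transition_ode(gen, s, t, steps=1000):
    """Integrate dP/dt = P Q(t), P(s, s) = I, with fixed-step RK4."""
    h = (t - s) / steps
    P = identity(len(gen(s)))
    deriv = lambda time, M: mat_mul(M, gen(time))
    for k in range(steps):
        tk = s + k * h
        k1 = deriv(tk, P)
        k2 = deriv(tk + h / 2, add(P, k1, h / 2))
        k3 = deriv(tk + h / 2, add(P, k2, h / 2))
        k4 = deriv(tk + h, add(P, k3, h))
        for slope, weight in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            P = add(P, slope, weight * h / 6)
    return P

if __name__ == "__main__":
    s, t = 0.0, 5.0
    print("one homogeneous piece:", transition_piecewise(generator, s, t, 1))
    print("ten homogeneous pieces:", transition_piecewise(generator, s, t, 10))
    print("forward-equation ODE:", transition_ode(generator, s, t))
```

Under these assumed rates, treating a whole observation interval as a single homogeneous piece ignores the growth of the 1->2 rate within the interval, so its transition probabilities drift away from the ODE solution; refining the pieces shrinks the gap. In a likelihood computation, each pair of consecutive observation times would contribute one such transition matrix, which is where a violated piecewise-homogeneous assumption could feed into biased parameter estimates.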