Background: Biomedical data are usually collections of longitudinal data assessed at certain points in time. Clinical observations assess the presence and severity of symptoms, which are the basis for the description and modeling of disease progression. Deciphering potential underlying unknowns solely from the distinct observations would substantially improve the understanding of pathological cascades. Hidden Markov Models (HMMs) have been successfully applied to the processing of possibly noisy continuous signals. Our aim was to improve the application of HMMs to multivariate time-series of categorically distributed data. Here, we used HMMs to study the prediction of the loss of free walking ability, a major clinical deterioration in the most common autosomal dominantly inherited ataxia disorder worldwide. Results: We present a prediction pipeline that processes data paired with a configuration file, enabling the construction, validation and querying of a fully parameterized HMM-based model. In particular, we provide a theoretical and practical framework for multivariate time-series inference based on HMMs that includes constructing multiple HMMs, each predicting a particular observable variable. Our analysis was performed on random data as well as on biomedical data on spinocerebellar ataxia type 3. Conclusions: HMMs are a promising approach for studying biomedical data that are naturally represented as multivariate time-series. Our implementation of an HMM framework is publicly available and can easily be adapted for further applications.
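The idea of building one HMM per observable variable and querying each for a prediction can be illustrated with a minimal sketch. This is not the published pipeline: the variable names (`gait`, `speech`), the parameter values, and the helper functions `filter_states` and `predict_next` are all hypothetical, and the parameters are fixed by hand rather than learned from data. The sketch runs a normalized forward pass over a sequence of categorical observations and then returns the distribution over the next observation.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]
# Hypothetical container: (initial distribution, transition matrix, emission matrix)
HMM = Tuple[List[float], Matrix, Matrix]


def filter_states(hmm: HMM, obs: Sequence[int]) -> List[float]:
    """Normalized forward pass: P(hidden state at the last time point | obs).

    Assumes every observed symbol has non-zero probability under the model;
    otherwise the normalization below divides by zero.
    """
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    for o in obs[1:]:
        # Propagate one step through the transitions, then weight by the emission.
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
    return alpha


def predict_next(hmm: HMM, obs: Sequence[int]) -> List[float]:
    """Distribution over the next categorical observation given the history."""
    pi, A, B = hmm
    n = len(pi)
    alpha = filter_states(hmm, obs)
    next_state = [sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]
    return [sum(next_state[j] * B[j][k] for j in range(n)) for k in range(len(B[0]))]


# One hand-specified HMM per observable variable (illustrative values only):
# two hidden states, three severity categories 0..2.
models: Dict[str, HMM] = {
    "gait": ([0.8, 0.2], [[0.9, 0.1], [0.0, 1.0]],
             [[0.70, 0.25, 0.05], [0.05, 0.25, 0.70]]),
    "speech": ([0.6, 0.4], [[0.8, 0.2], [0.1, 0.9]],
               [[0.60, 0.30, 0.10], [0.10, 0.30, 0.60]]),
}

# Categorical observations per variable at successive time points.
history: Dict[str, List[int]] = {"gait": [0, 1, 1], "speech": [0, 0, 2]}

predictions = {name: predict_next(models[name], history[name]) for name in models}
```

Each entry of `predictions` is a probability vector over the categories of one variable at the next time point. Keeping the models separate per variable keeps every emission matrix small, at the cost of ignoring dependencies between variables.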