In this paper, we propose a deep generative time series approach using latent temporal processes for modeling and holistically analyzing complex disease trajectories. We aim to find meaningful temporal latent representations of an underlying generative process that explain the observed disease trajectories in an interpretable and comprehensive way. To enhance the interpretability of these latent temporal processes, we develop a semi-supervised approach for disentangling the latent space using established medical concepts. Combining the generative approach with medical knowledge lets the model discover novel aspects of the disease while still incorporating established medical concepts. We show that the learned temporal latent processes can be used for further data analysis and clinical hypothesis testing, including finding similar patients and clustering the disease into new sub-types. Moreover, our method enables personalized online monitoring and prediction of multivariate time series, with uncertainty quantification. We demonstrate the effectiveness of our approach in modeling systemic sclerosis, showcasing the potential of our machine learning model to capture complex disease trajectories and acquire new medical knowledge.