In clinical practice, one often needs to identify whether a patient is at high risk of adverse outcomes after a key medical event. For example, quantifying the risk of adverse outcomes after an acute cardiovascular event helps healthcare providers identify the patients at highest risk of poor outcomes, i.e., those who may benefit from invasive therapies that can lower their risk. Assessing this risk is challenging, however, because longitudinal medical data are complex, variable, and heterogeneous, especially for individuals with chronic diseases such as heart failure. In this paper, we introduce Event-Based Contrastive Learning (EBCL), a method for learning embeddings of heterogeneous patient data that preserve temporal information before and after key index events. We demonstrate that EBCL can be used to construct models that yield improved performance on important downstream tasks relative to other pretraining methods. We develop and test the method using a cohort of heart failure patients obtained from a large hospital network and the publicly available MIMIC-IV dataset, which consists of patients in an intensive care unit at a large tertiary care center. On both cohorts, EBCL pretraining yields models that perform well on a number of downstream tasks, including prediction of mortality, hospital readmission, and length of stay. In addition, unsupervised EBCL embeddings effectively cluster heart failure patients into subgroups with distinct outcomes, thereby providing information that helps identify new heart failure phenotypes. The contrastive framework around the index event can be adapted to a wide array of time-series datasets and provides information that can be used to guide personalized care.