Objective: Sepsis is among the most serious conditions encountered in hospitals and carries high mortality. It results from a dysregulated immune response to infection that can lead to multiple organ dysfunction and death. Because the causes, clinical presentation, and recovery trajectories of sepsis vary widely, identifying sepsis sub-phenotypes is crucial for advancing our understanding of the condition, choosing targeted treatments and the optimal timing of interventions, and improving prognostication. Prior studies have applied clustering algorithms to electronic health records (EHRs) to describe sepsis sub-phenotypes based on organ-specific characteristics. However, these approaches did not capture temporal information, and their clustering procedures relied on uncertain assumptions about the relationships among sub-phenotypes. Methods: We developed a time-aware soft clustering algorithm, guided by clinical variables, to identify sepsis sub-phenotypes from data available in the EHR. Results: We identified six novel hybrid sepsis sub-phenotypes and evaluated them for medical plausibility. In addition, we built an early-warning sepsis prediction model using logistic regression. Conclusion: Our results suggest that these novel hybrid sub-phenotypes can provide more accurate information on sepsis-related organ dysfunction and recovery trajectories, which may inform management decisions and prognosis.
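To make the notion of soft clustering over temporal data concrete, the following is a minimal sketch and not the algorithm developed in this work, whose details the abstract does not specify. It assumes standard fuzzy c-means as a stand-in soft clustering method and handles time naively by flattening each patient's measurement sequence into a single feature vector. The function names (`fuzzy_c_means`, `make_synthetic`), the synthetic rising and falling trajectories, and all parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative stand-in, not the paper's method).

    Returns (centers, U), where U[i, k] is the degree to which sample i
    belongs to cluster k and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers are membership-weighted means of the samples.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # Membership update: closer centers receive higher membership.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    return centers, U


def make_synthetic(seed=0):
    """Hypothetical data: 40 patients, one variable at 6 time steps each."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 6)
    rising = 1.0 + 2.0 * t   # made-up worsening trajectory
    falling = 3.0 - 2.0 * t  # made-up recovering trajectory
    a = rising + 0.1 * rng.standard_normal((20, 6))
    b = falling + 0.1 * rng.standard_normal((20, 6))
    return np.vstack([a, b])


X = make_synthetic()
centers, U = fuzzy_c_means(X, n_clusters=2)
print("membership of first patient:", np.round(U[0], 3))
print("membership of last patient:", np.round(U[-1], 3))
```

Unlike hard clustering, each patient here receives a graded membership in every cluster, which is the property that allows a patient to be described as a mixture of sub-phenotypes. A genuinely time-aware method would model the ordering of and spacing between observations rather than flattening them as this sketch does.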