Traditional comorbidity scores (e.g., Charlson and Elixhauser) are widely used for risk adjustment and patient stratification, but they have two key limitations: (i) they are largely mortality-centric and align poorly with other clinical outcomes, and (ii) their linear, rule-based structure cannot capture nonlinear, outcome-specific risk relationships. We propose a Machine-Learned Comorbidity Index (MLCI) that maps diagnosis codes to a single scalar score by maximizing the normalized Hilbert-Schmidt Independence Criterion (nHSIC) between the learned score and multiple clinical outcomes. MLCI captures nonlinear risk-outcome dependence and is supported by theory that characterizes when a unified, informative admission-level ordering can be achieved across outcomes. Experiments on several benchmark electronic health record (EHR) datasets show that MLCI outperforms strong baselines across multiple evaluation metrics.