Randomized controlled trials (RCTs) are the standard for evaluating the effectiveness of clinical interventions. To address the limitations of RCTs in real-world populations, we developed a methodology that uses a large observational electronic health record (EHR) dataset. Principles of regression discontinuity (RD) were used to derive randomized data subsets, on which expert-driven interventions were tested using do-operations on dynamic Bayesian networks (DBNs). This combined method was applied to a chronic kidney disease (CKD) cohort of more than two million individuals to characterize the associational and causal relationships of CKD variables with a surrogate outcome of a ≥40% decline in estimated glomerular filtration rate (eGFR). The associational and causal analyses yielded similar findings across DBNs from two independent healthcare systems. The associational analysis showed that the most influential variables were eGFR, urine albumin-to-creatinine ratio, and pulse pressure, whereas the causal analysis showed eGFR as the most influential variable, followed by modifiable factors such as medications that may impact kidney function over time. This methodology demonstrates how real-world EHR data can be used to provide population-level insights to inform improved healthcare delivery.