Learning causal relationships from observational data is essential when randomized experiments are not possible, as is often the case in healthcare. Discovering causal relationships in time-series health data is even more challenging when those relationships change over the course of a disease, for example when a medication is most effective early on or for individuals with severe disease. Stage variables, such as weeks of pregnancy, disease stages, or biomarkers like HbA1c, can influence which causal relationships hold for a patient. However, causal inference within each stage is often not possible because data are limited, while combining all data risks incorrect or missed inferences. To address this, we propose Causal Discovery with Stage Variables (CDSV), which uses stage variables to reweight data from multiple time series while accounting for different causal relationships in each stage. In simulated data, CDSV discovers more causes with fewer false discoveries than baselines; in eICU, it has a lower false discovery rate (FDR) than baselines; and in MIMIC-III, it discovers more clinically relevant causes of high blood pressure.