Causal inference for testing clinical hypotheses from observational data is difficult because the underlying data-generating model and its associated causal graph are usually unavailable. Furthermore, observational data may contain missing values, which affect the recovery of the causal graph by causal discovery algorithms, a crucial issue often ignored in clinical studies. In this work, we use data from a multicenter study on endometrial cancer to analyze the impact of different missingness mechanisms on the recovered causal graph. To do so, we extend state-of-the-art causal discovery algorithms to exploit expert knowledge without sacrificing theoretical soundness. We validate the recovered graph with expert physicians, showing that our approach finds clinically relevant solutions. Finally, we discuss the goodness of fit of our graph and its consistency from a clinical decision-making perspective, using graphical separation to validate causal pathways.