Understanding causal relationships is critical for healthcare. Accurate causal models make predictive models more interpretable and provide a basis for counterfactual and interventional reasoning and for estimating treatment effects. However, would-be practitioners of causal discovery face a dizzying array of algorithms with no clear best choice. This abundance of competitive algorithms makes ensembling a natural choice for practical applications. At the same time, real-world use cases frequently violate the assumptions of common causal discovery algorithms, forcing heavy reliance on expert knowledge. Inspired by recent work on dynamically requested expert knowledge and on LLMs as experts, we present a flexible model averaging method that uses dynamically requested expert knowledge to ensemble a diverse array of causal discovery algorithms. Experiments demonstrate the efficacy of our method with imperfect experts such as LLMs on both clean and noisy data. We also analyze the impact of different degrees of expert correctness and assess the capabilities of LLMs for clinical causal discovery, providing valuable insights for practitioners.