Causality is widely useful in decision-making, and that usefulness arises from the interplay between causal discovery and causal inference. Nevertheless, evaluations of causal discovery methods leave a notable gap: they place insufficient emphasis on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods, including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Using a distribution-level evaluation, we offer valuable insights into the efficacy of these causal discovery methods for treatment effect estimation, considering synthetic, real-world, and low-data scenarios. Our results show that some of the algorithms studied effectively capture a wide range of useful and diverse average treatment effect (ATE) modes, while others tend to learn many low-probability modes, which affects the (unrelaxed) precision and recall.