Causality is widely useful in decision-making, and this usefulness arises from the interplay between causal discovery and causal inference. Nevertheless, a notable gap exists in how causal discovery methods are evaluated: insufficient emphasis is placed on downstream inference. To address this gap, we evaluate seven established baseline causal discovery methods, including a newly proposed method based on GFlowNets, on the downstream task of treatment effect estimation. Using a distribution-level evaluation, we offer insights into the efficacy of these causal discovery methods for treatment effect estimation across synthetic, real-world, and low-data scenarios. Our results show that some of the algorithms studied effectively capture a wide range of useful and diverse average treatment effect (ATE) modes, while others tend to learn many low-probability modes, which affects the (unrelaxed) recall and precision.