Causal discovery methods can identify valid adjustment sets for estimating the causal effect between a pair of target variables, even when the underlying causal graph is unknown. Global causal discovery methods learn the whole causal graph and therefore enable the recovery of optimal adjustment sets, i.e., sets with the lowest asymptotic variance, but they quickly become computationally prohibitive as the number of variables grows. Local causal discovery methods offer a more scalable alternative by focusing on the local neighborhood of the target variables, but are restricted to statistically suboptimal adjustment sets. In this work, we propose Local Optimal Adjustments Discovery (LOAD), a sound and complete causal discovery approach that combines the computational efficiency of local methods with the statistical optimality of global methods. First, LOAD identifies the causal relation between the targets and tests whether the causal effect is identifiable, using only local information. If the effect is identifiable, LOAD finds the possible descendants of the treatment and infers the optimal adjustment set as the parents of the outcome in a modified forbidden projection. Otherwise, it returns the locally valid parent adjustment sets. In our experiments on synthetic and realistic data, LOAD outperforms global methods in scalability while providing more accurate effect estimation than local methods.
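To make the contrast between a parent adjustment set and an optimal adjustment set concrete, the following is a minimal sketch on a small, fully known DAG. It is not the LOAD procedure, which works from local information without access to the graph. It assumes the standard graphical characterization of the optimal adjustment set: the parents of the causal nodes (nodes on directed paths from treatment to outcome, excluding the treatment) minus the forbidden nodes (descendants of the causal nodes, plus the treatment). The function names, the edge-list representation, and the example graph are all hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable(adjacency, sources):
    """Return all nodes reachable from `sources` along `adjacency` (sources included)."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def optimal_adjustment_set(edges, treatment, outcome):
    """Optimal adjustment set in a known DAG given as (parent, child) pairs.

    Assumption: the DAG is fully known, which is exactly what LOAD does not require.
    """
    children, parents = {}, {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
        parents.setdefault(v, set()).add(u)

    # Causal nodes: descendants of the treatment that are also ancestors of
    # the outcome (the outcome itself included), excluding the treatment.
    de_x = reachable(children, [treatment])
    an_y = reachable(parents, [outcome])
    causal_nodes = (de_x & an_y) - {treatment}

    # Forbidden nodes: descendants of the causal nodes, plus the treatment.
    forbidden = reachable(children, causal_nodes) | {treatment}

    # Parents of the causal nodes that are not forbidden.
    pa_cn = set()
    for node in causal_nodes:
        pa_cn |= parents.get(node, set())
    return pa_cn - forbidden


# Hypothetical example: I is an instrument, Z a confounder, M a mediator,
# W a cause of the outcome only, D a descendant of the outcome.
edges = [("I", "X"), ("Z", "X"), ("Z", "Y"), ("X", "M"),
         ("M", "Y"), ("W", "Y"), ("Y", "D")]

parent_set = {u for u, v in edges if v == "X"}          # parents of X: I and Z
optimal_set = optimal_adjustment_set(edges, "X", "Y")   # Z and W
print(sorted(parent_set), sorted(optimal_set))
```

In this example both sets are valid, but they differ in the way the abstract describes: the parent set of the treatment includes the instrument I, which typically inflates estimator variance, while the optimal set drops I and adds W, a cause of the outcome that helps explain outcome variation.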