Causal discovery methods can identify valid adjustment sets for estimating the causal effect between a pair of target variables, even when the underlying causal graph is unknown. Global causal discovery methods learn the whole causal graph and therefore enable the recovery of optimal adjustment sets, i.e., sets that yield the lowest asymptotic variance, but they quickly become computationally prohibitive as the number of variables grows. Local causal discovery methods offer a more scalable alternative by focusing on the local neighborhood of the target variables, but are restricted to statistically suboptimal adjustment sets. In this work, we propose Local Optimal Adjustments Discovery (LOAD), a sound and complete causal discovery approach that combines the computational efficiency of local methods with the statistical optimality of global methods. First, LOAD identifies the causal relation between the targets and tests, using only local information, whether the causal effect is identifiable. If the effect is identifiable, LOAD finds the optimal adjustment set by leveraging local causal discovery to infer the mediators and their parents. Otherwise, it returns the locally valid parent adjustment sets based on the learned local structure. In our experiments on synthetic and realistic data, LOAD outperforms global methods in scalability while providing more accurate effect estimation than local methods.
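To make the contrast between a parent adjustment set and an optimal adjustment set concrete, the following is a minimal sketch that assumes the full DAG is already known. It is not the LOAD procedure, which infers the required structure through local discovery. It uses the standard graphical characterization of the optimal set: the parents of the nodes on directed paths from the treatment to the outcome, minus the treatment and all descendants of those nodes. The function names, the `parents` dictionary layout, and the toy graph are illustrative assumptions.

```python
from collections import deque


def reachable(edges, sources):
    """Return every node reachable from `sources` via `edges`, sources included."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def optimal_adjustment_set(parents, x, y):
    """Optimal adjustment set for the effect of x on y in a known DAG.

    `parents` maps each node to the set of its parents (hypothetical layout).
    """
    # Invert the parent map to obtain a child map.
    children = {}
    for node, pa in parents.items():
        for p in pa:
            children.setdefault(p, set()).add(node)

    # Nodes on directed paths from x to y, excluding x: mediators plus y.
    causal_nodes = (reachable(children, [x]) & reachable(parents, [y])) - {x}

    # Forbidden nodes: the treatment and all descendants of the causal nodes.
    forbidden = reachable(children, causal_nodes) | {x}

    # Parents of the causal nodes that are not forbidden.
    candidates = set()
    for node in causal_nodes:
        candidates |= set(parents.get(node, ()))
    return candidates - forbidden


# Toy DAG (assumed for illustration): C confounds X and Y, P affects only X,
# M mediates X -> Y, W is a parent of the mediator, D is a child of Y.
toy_parents = {
    "X": {"C", "P"},
    "M": {"X", "W"},
    "Y": {"M", "C"},
    "D": {"Y"},
}

parent_set = toy_parents["X"]
optimal_set = optimal_adjustment_set(toy_parents, "X", "Y")
print(sorted(parent_set), sorted(optimal_set))
```

In this toy graph the parent adjustment set of `X` is `{C, P}`, while the optimal set is `{C, W}`: it drops `P`, which influences only the treatment, and adds `W`, which explains variation in the outcome through the mediator.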