Causal discovery is crucial for causal inference in observational studies, as it can enable the identification of valid adjustment sets (VAS) for unbiased effect estimation. However, global causal discovery is notoriously hard in the nonparametric setting, with exponential time and sample complexity in the worst case. To address this, we propose local discovery by partitioning (LDP), a local causal discovery method tailored for downstream inference tasks that requires neither parametric nor pretreatment assumptions. LDP is a constraint-based procedure that, under sufficient conditions, returns a VAS for an exposure-outcome pair under latent confounding. The total number of independence tests performed is worst-case quadratic in the cardinality of the variable set. Asymptotic theoretical guarantees are numerically validated on synthetic graphs. Adjustment sets from LDP yield less biased and more precise average treatment effect estimates than those from baseline discovery algorithms, and LDP outperforms the baselines on confounder recall, runtime, and test count for VAS discovery. Notably, LDP ran at least 1300x faster than baselines on a benchmark.
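To illustrate why a valid adjustment set matters for effect estimation, the following minimal sketch compares an unadjusted and an adjusted estimate of the average treatment effect on simulated data. It is not an implementation of LDP: the adjustment set is assumed to be already known, the outcome model is assumed linear, and the names `simulate` and `ols_effect` as well as all coefficients are hypothetical choices made for this example.

```python
import numpy as np


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate a binary exposure x, outcome y, and two observed confounders z.

    Hypothetical data-generating process chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))  # confounders: affect both x and y
    x = (z @ np.array([1.0, -1.0]) + rng.normal(size=n) > 0).astype(float)
    y = true_effect * x + z @ np.array([3.0, -3.0]) + rng.normal(size=n)
    return x, y, z


def ols_effect(x, y, adjust=None):
    """Return the OLS coefficient on x, optionally adjusting for covariates.

    `adjust` is an (n, k) array holding the adjustment set, or None.
    """
    cols = [np.ones_like(x), x]
    if adjust is not None:
        cols.extend(adjust.T)  # one design column per adjustment variable
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


if __name__ == "__main__":
    x, y, z = simulate()
    print("unadjusted estimate:", ols_effect(x, y))
    print("adjusted estimate:  ", ols_effect(x, y, adjust=z))
```

Under these assumptions, the unadjusted estimate absorbs the confounders' influence on the outcome and overstates the effect, while adjusting for `z` recovers a value close to the simulated effect of 2.0. Regression adjustment is used here only for brevity; any estimator that conditions on a valid adjustment set could take its place.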