We consider the problem of computing bounds for causal queries on causal graphs with unobserved confounders and discrete-valued observed variables, where identifiability does not hold. Existing non-parametric approaches for computing such bounds use linear programming (LP) formulations that quickly become intractable for existing solvers because the size of the LP grows exponentially in the number of edges in the causal graph. We show that this LP can be significantly pruned, allowing us to compute bounds for much larger causal inference problems than existing techniques can handle. This pruning procedure also allows us to compute bounds in closed form for a special class of problems, including a well-studied family of problems in which multiple confounded treatments influence an outcome. We extend our pruning methodology to fractional LPs, which compute bounds for causal queries that incorporate additional observations about the unit. We show in experiments that our methods provide significant runtime improvements over benchmarks, and we extend our results to the finite-data setting. For causal inference without additional observations, we propose an efficient greedy heuristic that produces high-quality bounds and scales to problems several orders of magnitude larger than those for which the pruned LP can be solved.
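To give a feel for the blow-up in LP size, the following is a minimal sketch that counts LP variables under one common parametrization, in which the unobserved confounder is replaced by a joint distribution over response functions (deterministic maps from a node's parents to its value). This parametrization, the helper names `num_response_functions`, `lp_variable_count`, and `multi_treatment_graph`, and the example graph are assumptions made for illustration; the exact LP used in the paper may differ.

```python
from math import prod


def num_response_functions(card, parents, node):
    """Count deterministic maps from parent configurations to values of `node`."""
    # Number of joint configurations of the parents (1 if there are none).
    n_parent_configs = prod(card[p] for p in parents[node])
    return card[node] ** n_parent_configs


def lp_variable_count(card, parents):
    """Count LP variables, assuming all nodes share a single confounder.

    Assumption: one LP variable per joint choice of response functions.
    """
    return prod(num_response_functions(card, parents, v) for v in card)


def multi_treatment_graph(k):
    """Hypothetical graph: k binary treatments T1..Tk, each pointing to a binary outcome Y."""
    treatments = [f"T{i}" for i in range(1, k + 1)]
    card = {t: 2 for t in treatments}
    card["Y"] = 2
    parents = {t: [] for t in treatments}
    parents["Y"] = list(treatments)
    return card, parents


if __name__ == "__main__":
    for k in range(1, 5):
        card, parents = multi_treatment_graph(k)
        print(f"treatments={k}  LP variables={lp_variable_count(card, parents)}")
```

Under these assumptions, each extra edge into the outcome doubles the number of parent configurations and therefore squares the number of response functions for the outcome, which is the kind of growth that makes the unpruned LP impractical beyond a handful of treatments.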