Identifying causal effects is a key problem across many disciplines. The two long-standing approaches to estimating causal effects are observational and experimental (randomized) studies. Observational studies can suffer from unmeasured confounding, which may render the causal effects unidentifiable. On the other hand, direct experiments on the target variable may be too costly or even infeasible to conduct. A middle ground between these two approaches is to estimate the causal effect of interest through proxy experiments, which are conducted on variables that are cheaper to intervene on than the main target. Akbari et al. [2022] studied this setting and demonstrated that the problem of designing the optimal (minimum-cost) experiment for causal effect identification is NP-complete, and provided a naive algorithm that, in the worst case, may require solving exponentially many NP-hard problems as subroutines. In this work, we provide a few reformulations of the problem that allow for designing significantly more efficient algorithms to solve it, as witnessed by our extensive simulations. Additionally, we study the closely related problem of designing experiments that enable us to identify a given effect through valid adjustment sets.