Causal effect estimation from observational data is a fundamental task in the empirical sciences. It becomes particularly challenging when unobserved confounders are present in a system. This paper focuses on front-door adjustment, a classic technique that uses observed mediators to identify causal effects even in the presence of unobserved confounding. While the statistical properties of front-door estimation are quite well understood, its algorithmic aspects remained unexplored for a long time. Recently, Jeong, Tian, and Bareinboim [NeurIPS 2022] presented the first polynomial-time algorithm for finding sets satisfying the front-door criterion in a given directed acyclic graph (DAG), with an $O(n^3(n+m))$ run time, where $n$ denotes the number of variables and $m$ the number of edges of the causal graph. In our work, we give the first linear-time, i.e., $O(n+m)$, algorithm for this task, which thus reaches the asymptotically optimal time complexity. This result implies an algorithm that enumerates all front-door adjustment sets with $O(n(n+m))$ delay, again improving on the previous work by Jeong et al. by a factor of $n^3$. Moreover, we provide the first linear-time algorithm for finding a minimal front-door adjustment set. We offer implementations of our algorithms in multiple programming languages to facilitate practical usage, and we empirically validate their feasibility, even for large graphs.
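To illustrate what front-door adjustment computes once a valid set has been found, the following minimal sketch evaluates the standard front-door formula $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$ on a toy model with one binary mediator $Z$ and a hidden confounder $U$. It is not the linear-time set-finding algorithm of this paper. The function names (`front_door`, `true_effect`) and all probability values are hypothetical and chosen only for illustration.

```python
from itertools import product

def bern(p1, v):
    """Probability that a binary variable with P(1) = p1 takes value v."""
    return p1 if v == 1 else 1.0 - p1

# Hypothetical toy model: U -> X, U -> Y, X -> Z, Z -> Y (U is unobserved).
P_U1 = 0.4                                   # P(U = 1)
P_X1 = {0: 0.2, 1: 0.8}                      # P(X = 1 | u)
P_Z1 = {0: 0.3, 1: 0.9}                      # P(Z = 1 | x)
P_Y1 = {(0, 0): 0.1, (0, 1): 0.5,            # P(Y = 1 | z, u)
        (1, 0): 0.6, (1, 1): 0.9}

# Observational joint P(x, z, y), obtained by summing out the hidden U.
joint = {}
for u, x, z, y in product((0, 1), repeat=4):
    pr = bern(P_U1, u) * bern(P_X1[u], x) * bern(P_Z1[x], z) * bern(P_Y1[(z, u)], y)
    joint[(x, z, y)] = joint.get((x, z, y), 0.0) + pr

def prob(joint, **fixed):
    """Marginal probability of the event described by fixed x/z/y values."""
    names = ("x", "z", "y")
    return sum(p for key, p in joint.items()
               if all(key[names.index(n)] == v for n, v in fixed.items()))

def front_door(joint, x, y):
    """P(y | do(x)) via the front-door formula, using observed data only."""
    xs = sorted({k[0] for k in joint})
    zs = sorted({k[1] for k in joint})
    total = 0.0
    for z in zs:
        p_z_given_x = prob(joint, x=x, z=z) / prob(joint, x=x)
        inner = sum(
            prob(joint, x=x2, z=z, y=y) / prob(joint, x=x2, z=z) * prob(joint, x=x2)
            for x2 in xs
        )
        total += p_z_given_x * inner
    return total

def true_effect(x, y):
    """Ground-truth P(y | do(x)), computed with access to the hidden U."""
    return sum(
        bern(P_Z1[x], z) * sum(bern(P_U1, u) * bern(P_Y1[(z, u)], y) for u in (0, 1))
        for z in (0, 1)
    )

if __name__ == "__main__":
    naive = prob(joint, x=1, y=1) / prob(joint, x=1)   # plain conditional P(y | x)
    print("front-door:", front_door(joint, 1, 1))
    print("true      :", true_effect(1, 1))
    print("naive     :", naive)
```

In this toy model the front-door estimate agrees with the interventional ground truth, while the plain conditional $P(y \mid x)$ is biased by the hidden confounder.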