Estimating causal effects from observational data is a fundamental task in the empirical sciences. It becomes particularly challenging when the system contains unobserved confounders. This paper focuses on front-door adjustment, a classic technique that uses observed mediators to identify causal effects even in the presence of unobserved confounding. While the statistical properties of front-door estimation are quite well understood, its algorithmic aspects remained unexplored for a long time. Recently, Jeong, Tian, and Bareinboim [NeurIPS 2022] presented the first polynomial-time algorithm for finding sets that satisfy the front-door criterion in a given directed acyclic graph (DAG), with a run time of $O(n^3(n+m))$, where $n$ denotes the number of variables and $m$ the number of edges of the causal graph. In this work, we give the first linear-time, i.e., $O(n+m)$, algorithm for this task, which is asymptotically optimal. This result implies an algorithm that enumerates all front-door adjustment sets with $O(n(n+m))$ delay, again improving on the previous work of Jeong et al.\ by a factor of $n^3$. Moreover, we provide the first linear-time algorithm for finding a minimal front-door adjustment set. We offer implementations of our algorithms in multiple programming languages to facilitate practical use, and we empirically validate their feasibility even for large graphs.
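To make concrete what a front-door adjustment set is used for once it has been found, the following is a minimal sketch of the standard front-door adjustment formula, $P(y \mid do(x)) = \sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$, evaluated on a toy discrete model. It illustrates the estimation step only and is not the linear-time search algorithm described above. The model (a hidden confounder $U$ with edges $U \to X$, $U \to Y$, and a mediator chain $X \to Z \to Y$), all probabilities, and all function names (`toy_joint`, `prob`, `front_door`, `naive_conditional`) are assumptions made purely for illustration.

```python
from itertools import product


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def toy_joint():
    """Observational joint P(x, z, y) for a hypothetical model
    U -> X -> Z -> Y with U -> Y; U is unobserved, so it is summed out."""
    p_x1_given_u = {0: 0.2, 1: 0.8}                      # assumed P(X=1 | u)
    p_z1_given_x = {0: 0.3, 1: 0.9}                      # assumed P(Z=1 | x)
    p_y1_given_zu = {(0, 0): 0.1, (0, 1): 0.5,
                     (1, 0): 0.6, (1, 1): 0.9}           # assumed P(Y=1 | z, u)
    joint = {}
    for u, x, z, y in product((0, 1), repeat=4):
        pr = (0.5                                        # assumed P(U=u) = 0.5
              * bern(p_x1_given_u[u], x)
              * bern(p_z1_given_x[x], z)
              * bern(p_y1_given_zu[(z, u)], y))
        joint[(x, z, y)] = joint.get((x, z, y), 0.0) + pr
    return joint


def prob(joint, x=None, z=None, y=None):
    """Marginal probability of an assignment; None means 'summed out'."""
    return sum(pr for (a, b, c), pr in joint.items()
               if x in (None, a) and z in (None, b) and y in (None, c))


def front_door(joint, x, y):
    """P(Y=y | do(X=x)) via the front-door formula with mediator Z."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = prob(joint, x=x, z=z) / prob(joint, x=x)
        inner = sum(
            prob(joint, x=x2, z=z, y=y) / prob(joint, x=x2, z=z)
            * prob(joint, x=x2)
            for x2 in (0, 1)
        )
        total += p_z_given_x * inner
    return total


def naive_conditional(joint, x, y):
    """P(Y=y | X=x), which is biased here by the hidden confounder U."""
    return prob(joint, x=x, y=y) / prob(joint, x=x)
```

In this toy model, `front_door(toy_joint(), 1, 1)` evaluates to 0.705, which matches the interventional probability computed directly from the assumed mechanisms, whereas `naive_conditional(toy_joint(), 1, 1)` gives 0.798. The gap is the confounding bias that the front-door set $\{Z\}$ removes.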