Inferring the effect of interventions within complex systems is a fundamental problem of statistics. A widely studied approach employs structural causal models that postulate noisy functional relations among a set of interacting variables. The underlying causal structure is then naturally represented by a directed graph whose edges indicate direct causal dependencies. In a recent line of work, additional assumptions on the causal models have been shown to render this causal graph identifiable from observational data alone. One example is the assumption of linear causal relations with equal error variances that we will take up in this work. When the graph structure is known, classical methods may be used for calculating estimates and confidence intervals for causal effects. However, in many applications, expert knowledge that provides an a priori valid causal structure is not available. Lacking alternatives, a commonly used two-step approach first learns a graph and then treats the graph as known in inference. This, however, yields confidence intervals that are overly optimistic and fail to account for the data-driven model choice. We argue that to draw reliable conclusions, it is necessary to incorporate the remaining uncertainty about the underlying causal structure in confidence statements about causal effects. To address this issue, we present a framework based on test inversion that allows us to give confidence regions for total causal effects that capture both sources of uncertainty: causal structure and numerical size of nonzero effects.
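To make the model class and the target quantity concrete, the following is a minimal sketch of a linear structural causal model with equal error variances and of the total causal effect it induces. It does not implement the test-inversion framework itself. The three-variable graph, the coefficient values, and the helper names (`total_effects`, `simulate`, `ols_slope`) are illustrative assumptions and are not taken from the paper. The sketch also shows the second step of the two-step approach in the favorable case where the assumed graph happens to be correct.

```python
import random

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def total_effects(B):
    """Total causal effects in the linear model X = B X + eps.

    B[j][k] is the direct effect of X_k on X_j. For an acyclic graph B is
    nilpotent, so (I - B)^{-1} = I + B + ... + B^(d-1). Entry [j][i] of the
    result is the total effect of X_i on X_j (sum over directed paths of the
    product of edge weights).
    """
    d = len(B)
    T = [[float(i == j) for j in range(d)] for i in range(d)]
    P = [row[:] for row in T]
    for _ in range(d - 1):
        P = mat_mul(P, B)
        T = [[T[i][j] + P[i][j] for j in range(d)] for i in range(d)]
    return T

def simulate(B, order, sigma, n, rng):
    """Draw n samples; every error term has the same standard deviation sigma.

    `order` must be a topological order of the nodes (an assumption of this
    sketch: the true graph is known when simulating).
    """
    d = len(B)
    data = []
    for _ in range(n):
        x = [0.0] * d
        for j in order:
            x[j] = sum(B[j][k] * x[k] for k in range(d)) + rng.gauss(0.0, sigma)
        data.append(x)
    return data

def ols_slope(xs, ys):
    """Slope of the simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical graph: X0 -> X1 -> X2 and X0 -> X2.
B = [[0.0, 0.0, 0.0],
     [0.8, 0.0, 0.0],
     [0.3, 0.5, 0.0]]

T = total_effects(B)  # T[2][0] = 0.3 + 0.8 * 0.5

rng = random.Random(0)
data = simulate(B, order=[0, 1, 2], sigma=1.0, n=5000, rng=rng)

# X0 has no parents in this graph, so regressing X2 on X0 alone targets the
# total effect of X0 on X2. This step is only valid if the graph is correct.
slope = ols_slope([row[0] for row in data], [row[2] for row in data])
print("true total effect:", T[2][0], "estimate:", slope)
```

If the graph were instead learned from the same data, the regression step would inherit the uncertainty of that choice, which is the source of the overly optimistic intervals described above.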