The Potential Outcome Framework (POF) plays a prominent role in causal inference. Most causal inference models based on the POF (CIMs-B-POF) are designed to eliminate confounding bias and implicitly rely on the assumption of Confounding Covariates, which posits that the covariates consist solely of confounders. This assumption is difficult to maintain in practice, particularly with high-dimensional covariates. Although some methods have been proposed to separate the distinct components of the covariates before conducting causal inference, the consequences of treating non-confounding covariates as confounders remain unclear, which poses a risk when applying CIMs-B-POF in practical scenarios. In this paper, we present a unified graphical framework for CIMs-B-POF that clarifies the principles underlying these models. Using this framework, we quantitatively analyze how the inference performance of CIMs-B-POF is affected when the covariates include various types of non-confounding variables, such as instrumental variables, mediators, colliders, and adjustment variables. The key findings are twofold: for the task of eliminating confounding bias, the optimal scenario is for the covariates to consist exclusively of confounders; for the subsequent task of inferring counterfactual outcomes, adjustment variables contribute to more accurate inferences. Extensive experiments on synthetic datasets consistently validate these theoretical conclusions.
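The covariate roles named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's graphical framework or any of the CIMs-B-POF; it swaps in a plain linear structural model and ordinary least squares, and every coefficient, variable name, and helper function (`simulate`, `fit`) is an assumption made up for illustration. Under these assumed coefficients the total effect of the treatment on the outcome is 1.5 (a direct effect of 1.0 plus 0.5 through the mediator).

```python
import numpy as np

TRUE_TOTAL_EFFECT = 1.5  # direct effect 1.0 + (0.5 * 1.0) through the mediator


def simulate(n=20_000, seed=0):
    """Draw data from a hypothetical linear structural model."""
    rng = np.random.default_rng(seed)

    def noise():
        return rng.normal(size=n)

    c = noise()                                  # confounder: affects T and Y
    i = noise()                                  # instrument: affects T only
    a = noise()                                  # adjustment variable: affects Y only
    t = c + i + noise()                          # continuous treatment
    m = 0.5 * t + noise()                        # mediator: on the path T -> M -> Y
    y = t + m + 2.0 * c + 2.0 * a + noise()      # outcome
    k = t + y + noise()                          # collider: caused by T and Y
    return {"t": t, "y": y, "confounder": c, "instrument": i,
            "adjustment": a, "mediator": m, "collider": k}


def fit(data, covariates):
    """Regress Y on T plus the chosen covariates with OLS.

    Returns the coefficient on T and the residual variance of Y.
    """
    columns = [np.ones_like(data["t"]), data["t"]]
    columns += [data[name] for name in covariates]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, data["y"], rcond=None)
    residuals = data["y"] - X @ beta
    return beta[1], residuals.var()


data = simulate()
covariate_sets = {
    "none": [],
    "confounder only": ["confounder"],
    "confounder + instrument": ["confounder", "instrument"],
    "confounder + adjustment": ["confounder", "adjustment"],
    "confounder + mediator": ["confounder", "mediator"],
    "confounder + collider": ["confounder", "collider"],
}
results = {label: fit(data, names) for label, names in covariate_sets.items()}

for label, (effect, resid_var) in results.items():
    print(f"{label:26s} effect on T = {effect:6.3f}   residual variance = {resid_var:5.2f}")
```

In this toy linear setting, adjusting for the confounder alone recovers the total effect, adding the mediator leaves only the direct effect of 1.0, and adding the collider pulls the estimate far from 1.5. Adding the adjustment variable leaves the effect estimate essentially unchanged but lowers the residual variance of the outcome model, which is loosely in the spirit of the abstract's claim about counterfactual outcome inference. None of this reproduces the paper's quantitative analysis; it only makes the covariate roles concrete.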