Detecting and measuring confounding effects from data is a key challenge in causal inference. Existing methods frequently assume causal sufficiency, disregarding the presence of unobserved confounding variables, even though causal sufficiency is both unrealistic and empirically untestable. Additionally, existing methods make strong parametric assumptions about the underlying causal generative process to guarantee the identifiability of confounding variables. We relax both the causal sufficiency and the parametric assumptions and, leveraging recent advances in causal discovery and confounding analysis with non-i.i.d. data, propose a comprehensive approach for detecting and measuring confounding. We consider various definitions of confounding and introduce tailored methodologies to achieve three objectives: (i) detecting and measuring confounding among a set of variables, (ii) separating observed and unobserved confounding effects, and (iii) understanding the relative strengths of confounding bias between different sets of variables. We state useful properties that a confounding measure should have and propose measures that satisfy them. Empirical results support the theoretical analysis.