Distinguishing causal relationships from mere correlations is important in many applications. However, unobserved variables, such as latent confounders, can bias the conditional independence tests that constraint-based causal discovery commonly uses to identify causal relations. To address this issue, existing methods introduce proxy variables to adjust for the bias caused by the unobserved variables. However, these methods are either limited to categorical variables or rely on strong parametric assumptions for identification. In this paper, we propose a novel hypothesis-testing procedure that tests for the existence of a causal relationship between continuous variables without any parametric assumptions. Our procedure is based on discretization, which, under completeness conditions, asymptotically yields a linear equation whose coefficient vector is identifiable under the causal null hypothesis. Based on this equation, we construct our test statistic and establish its asymptotic level and power. We validate the effectiveness of our procedure on both synthetic and real-world data.