Testing conditional independence has many applications, such as Bayesian network learning and causal discovery, and a variety of test methods have been proposed. However, existing methods generally fail when only discretized observations are available. Specifically, suppose $X_1$, $\tilde{X}_2$ and $X_3$ are observed variables, where $\tilde{X}_2$ is a discretization of the latent variable $X_2$. Applying existing test methods to observations of $X_1$, $\tilde{X}_2$ and $X_3$ can lead to false conclusions about the underlying conditional independence among $X_1$, $X_2$ and $X_3$. Motivated by this, we propose a conditional independence test specifically designed to accommodate such discretization. To achieve this, we design bridge equations to recover the parameter that reflects the statistical information of the underlying latent continuous variables. We also derive an appropriate test statistic and its asymptotic distribution under the null hypothesis of conditional independence. Both theoretical results and empirical validation demonstrate the effectiveness of our test.
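
The failure mode described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: a linear-Gaussian model in which $X_1$ and $X_3$ each depend on $X_2$ plus independent noise (so $X_1$ and $X_3$ are conditionally independent given $X_2$), a binarization of $X_2$ at zero standing in for $\tilde{X}_2$, and a standard partial-correlation test with the Fisher z-transform standing in for "existing methods". It does not implement the proposed test or the bridge equations.

```python
import math
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z (with intercept)."""
    design = np.column_stack([np.ones(len(z)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def fisher_z_pvalue(r, n, n_cond=1):
    """Two-sided p-value of the Fisher z-test for zero partial correlation."""
    z = math.atanh(r) * math.sqrt(n - n_cond - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Assumed toy model: X1 and X3 are independent given the latent X2.
rng = np.random.default_rng(0)
n = 5000
x2 = rng.normal(size=n)
x1 = x2 + rng.normal(size=n)
x3 = x2 + rng.normal(size=n)

# Assumed discretization: only the sign of X2 is observed.
x2_tilde = (x2 > 0).astype(float)

r_latent = partial_corr(x1, x3, x2)          # conditioning on the latent X2
r_discrete = partial_corr(x1, x3, x2_tilde)  # conditioning on the discretized X2
p_latent = fisher_z_pvalue(r_latent, n)
p_discrete = fisher_z_pvalue(r_discrete, n)

print(f"given X2:       r = {r_latent:+.3f}, p = {p_latent:.3g}")
print(f"given X2_tilde: r = {r_discrete:+.3f}, p = {p_discrete:.3g}")
```

In this toy setting, conditioning on the binarized variable leaves part of the variation of $X_2$ unexplained, so the residuals of $X_1$ and $X_3$ remain correlated and the test rejects conditional independence even though it holds for the latent variables.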