Testing conditional independence has many applications, such as Bayesian network learning and causal discovery. Many test methods have been proposed. However, existing methods generally fail when only discretized observations are available. Specifically, suppose $X_1$, $\tilde{X}_2$, and $X_3$ are observed variables, where $\tilde{X}_2$ is a discretization of the latent variable $X_2$. Applying existing test methods to the observations of $X_1$, $\tilde{X}_2$, and $X_3$ can lead to a false conclusion about the underlying conditional independence of the variables $X_1$, $X_2$, and $X_3$. Motivated by this, we propose a conditional independence test specifically designed to accommodate such discretization. To achieve this, we design bridge equations to recover the parameter reflecting the statistical information of the underlying latent continuous variables. We also derive an appropriate test statistic and its asymptotic distribution under the null hypothesis of conditional independence. Theoretical results and empirical validation demonstrate the effectiveness of our test.
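The failure mode described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a linear-Gaussian model in which $X_1$ and $X_3$ each depend only on $X_2$ plus independent noise (so $X_1$ and $X_3$ are conditionally independent given $X_2$), a binary discretization $\tilde{X}_2 = \mathbb{1}[X_2 > 0]$, and partial correlation as a stand-in for an "existing test method". The function names `partial_corr` and `simulate` are hypothetical, and the code does not implement the proposed test or its bridge equations.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing each on z (with intercept)."""
    Z = np.column_stack([np.ones_like(z, dtype=float), z])
    # Residuals of x and y after removing the best linear fit on z.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])


def simulate(n=20000, seed=0):
    """Assumed toy model: X1 and X3 are independent given the latent X2."""
    rng = np.random.default_rng(seed)
    x2 = rng.standard_normal(n)            # latent continuous variable
    x1 = x2 + rng.standard_normal(n)       # depends on X2 only
    x3 = x2 + rng.standard_normal(n)       # depends on X2 only
    x2_tilde = (x2 > 0).astype(float)      # observed binary discretization
    return partial_corr(x1, x3, x2), partial_corr(x1, x3, x2_tilde)


rho_latent, rho_discretized = simulate()
print(f"partial corr given latent X2:      {rho_latent:.3f}")
print(f"partial corr given discretized X2: {rho_discretized:.3f}")
```

In this toy model, conditioning on the latent $X_2$ gives a partial correlation close to zero, while conditioning on $\tilde{X}_2$ leaves a clearly nonzero one, because the discretized variable removes only part of the variation in $X_2$. A test applied to the discretized data would therefore tend to reject conditional independence even though it holds for the underlying variables.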