We propose a sequential, anytime-valid method to test the conditional independence of a response $Y$ and a predictor $X$ given a random vector $Z$. The proposed test is based on e-statistics and test martingales, which generalize likelihood ratios and allow valid inference at arbitrary stopping times. Following the recently introduced model-X setting, our test requires that the conditional distribution of $X$ given $Z$ be available, or at least a sufficiently sharp approximation thereof. Within this setting, we derive a general method for constructing e-statistics for testing conditional independence, show that it leads to growth-rate optimal e-statistics for simple alternatives, and prove that our method yields tests with asymptotic power one in the special case of a logistic regression model. A simulation study demonstrates that the approach is competitive in power with established sequential and nonsequential testing methods, and that it is robust to violations of the model-X assumption.
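To make the construction concrete, the following is a minimal sketch, not the authors' implementation, of one e-statistic of the kind described above. It rests on a standard observation: under the null, conditional on $(Y, Z)$, the predictor $X$ still follows the known law of $X$ given $Z$, so for any working conditional density $q$ the ratio $q(Y \mid X, Z) / \mathbb{E}_{X' \sim P(\cdot \mid Z)}[q(Y \mid X', Z)]$ has conditional expectation one. Everything specific here is an assumption made for illustration: binary $X$ and $Y$, scalar $Z$, the model-X law `p_x_given_z`, the logistic working alternative `q_y_given_xz` with a fixed hypothetical effect `beta`, and all function names.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def p_x_given_z(x, z):
    """Assumed-known model-X law (illustrative): X | Z ~ Bernoulli(sigmoid(z))."""
    p1 = sigmoid(z)
    return p1 if x == 1 else 1.0 - p1


def q_y_given_xz(y, x, z, beta):
    """Hypothetical working alternative: Y | X, Z ~ Bernoulli(sigmoid(beta*x + z))."""
    p1 = sigmoid(beta * x + z)
    return p1 if y == 1 else 1.0 - p1


def e_value(x, y, z, beta):
    """q(y | x, z) divided by its average over x' drawn from P(X | Z = z)."""
    denom = sum(p_x_given_z(xp, z) * q_y_given_xz(y, xp, z, beta) for xp in (0, 1))
    return q_y_given_xz(y, x, z, beta) / denom


def sequential_test(stream, beta=1.0, alpha=0.05):
    """Multiply e-values and stop the first time the product reaches 1/alpha.

    Returns (rejected, observations_used, final_wealth).
    """
    wealth, n = 1.0, 0
    for x, y, z in stream:
        n += 1
        wealth *= e_value(x, y, z, beta)
        if wealth >= 1.0 / alpha:
            return True, n, wealth
    return False, n, wealth


def simulate(n, effect, seed=0):
    """Toy data generator; effect=0.0 makes Y conditionally independent of X given Z."""
    rng = random.Random(seed)
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = 1 if rng.random() < sigmoid(z) else 0
        y = 1 if rng.random() < sigmoid(effect * x + z) else 0
        yield x, y, z
```

A call such as `sequential_test(simulate(2000, effect=1.0))` runs the test on simulated data. Because the running product is a nonnegative martingale with initial value one under the null, Ville's inequality bounds the probability of it ever reaching `1/alpha` by `alpha`, which is what permits stopping at any time. In this sketch `beta` is fixed in advance, which corresponds to a simple alternative; the choice of `q` affects power but not validity.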