We investigate statistical inference for logistic regression with high-dimensional covariates in settings where dependence among individuals is induced by an underlying Markov random field. Going beyond pairwise interaction models such as the Ising model, we consider a framework that accommodates more general tensor structures capturing higher-order dependencies. We develop a two-step procedure for inference on low-dimensional linear and quadratic functionals. The first step constructs a regularized maximum pseudolikelihood estimator, for which we establish consistency with high-dimensional features. However, as in other classical high-dimensional regression problems, this estimator is biased and cannot be used directly for valid statistical inference. The second step introduces a bias correction that yields an asymptotically normal estimator, from which one can construct confidence intervals and test hypotheses. Our results go beyond the existing literature, which either provides only estimation guarantees or is restricted to pairwise interaction models. We complement our theoretical analysis with simulation studies that confirm the effectiveness of the proposed method.