High-dimensional inference methods often rely on coefficient sparsity, an assumption that can be restrictive when signals are dense but individually weak. In such settings, valid inference may still be possible if the covariates exhibit sparse conditional dependence. Motivated by this observation, we propose Neighborhood-Localized Nested Regression (NLNR), a framework for coordinatewise inference in high-dimensional linear models with potentially dense coefficients. The central idea is to localize inference for a target coefficient to a low-dimensional working regression determined by a Sparse Conditional Neighborhood (SCN) of the target covariate. Specifically, for a given covariate, we estimate its SCN through nodewise $\ell_1$-penalized regression and then fit a regression using only the target covariate and its estimated neighborhood. Under suitable regularity conditions, we establish consistency and asymptotic normality of the resulting estimator. Building on this inferential reduction principle, we further develop a thresholding-based screening procedure with theoretical guarantees and a boosting variant that augments the working model with additional response-relevant covariates to improve finite-sample performance. Extensive simulations and an application to the Cancer Cell Line Encyclopedia (CCLE) dataset demonstrate favorable empirical performance.
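The two-step procedure described in the abstract (nodewise $\ell_1$-penalized regression to estimate the SCN, then a working regression on the target covariate and its estimated neighborhood) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `lasso_cd`, `estimate_neighborhood`, and `nlnr_fit` are hypothetical, the penalty level `lam=0.15` is an arbitrary fixed choice rather than a tuned or theoretically prescribed one, the standard error is the classical homoscedastic OLS formula, and the simulated design is an AR(1) chain, chosen because each covariate's conditional neighborhood is then just its two adjacent covariates. The screening and boosting extensions are not shown.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    Assumes centered data (no intercept is fitted)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b[j] removed).
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)
            b[j] = new
    return b

def estimate_neighborhood(X, j, lam):
    """Nodewise lasso: regress column j on all other columns and
    return the indices of the columns with nonzero coefficients."""
    others = np.delete(np.arange(X.shape[1]), j)
    gamma = lasso_cd(X[:, others], X[:, j], lam)
    return others[gamma != 0]

def nlnr_fit(X, y, j, lam):
    """Working regression of y on the target covariate and its estimated
    neighborhood. Returns (estimate, standard error, neighborhood).
    The standard error is the classical OLS one (an assumption of this sketch)."""
    nbrs = estimate_neighborhood(X, j, lam)
    cols = np.concatenate(([j], nbrs))  # target covariate goes first
    W = X[:, cols]
    n, k = W.shape
    G_inv = np.linalg.inv(W.T @ W)
    coef = G_inv @ W.T @ y
    resid = y - W @ coef
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * G_inv[0, 0])
    return coef[0], se, nbrs

# Simulated example: AR(1) covariates and a dense, weak coefficient vector.
rng = np.random.default_rng(0)
n, p, rho = 500, 30, 0.6
Z = rng.standard_normal((n, p))
X = np.empty((n, p))
X[:, 0] = Z[:, 0]
for k in range(1, p):
    X[:, k] = rho * X[:, k - 1] + np.sqrt(1 - rho ** 2) * Z[:, k]
beta = 0.1 * rng.choice([-1.0, 1.0], size=p)  # every coefficient is nonzero
y = X @ beta + rng.standard_normal(n)

j = 10  # index of the target covariate
est, se, nbrs = nlnr_fit(X, y, j, lam=0.15)
print(f"true beta_j = {beta[j]:+.2f}, estimate = {est:+.3f}, se = {se:.3f}")
print("estimated neighborhood:", nbrs.tolist())
```

In this Gaussian chain design, the covariates outside the neighborhood are conditionally independent of the target covariate given its neighbors, so omitting them leaves the population coefficient on the target unchanged even though every entry of `beta` is nonzero. Their contribution is absorbed into the working regression's residual variance, which is why the standard error is larger than it would be with the noise term alone.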