We consider the problem of inference for projection parameters in linear regression with increasing dimension. This problem has been studied under a variety of assumptions in the literature. The classical asymptotic normality result for the least squares estimator of the projection parameter holds only when the dimension $d$ of the covariates is of smaller order than $n^{1/2}$, where $n$ is the sample size; traditional sandwich-estimator-based Wald intervals are asymptotically valid in this regime. In this work, we propose a bias correction for the least squares estimator and prove the asymptotic normality of the resulting debiased estimator. Precisely, we provide an explicit finite-sample Berry-Esseen bound on the normal approximation to the law of linear contrasts of the proposed estimator, normalized by the sandwich standard error estimate. Under only finite moment conditions on the covariates and errors, our bound tends to zero as long as $d = o(n^{2/3})$ up to polylogarithmic factors. Furthermore, we leverage recent methods of statistical inference that do not require an estimator of the variance to perform asymptotically valid inference and that lead to sharper miscoverage control than Wald intervals. We conclude with a discussion of how our techniques can be generalized to increase the allowable range of $d$ even further.
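To fix ideas, the classical baseline the abstract refers to can be sketched as follows: ordinary least squares for the projection parameter, the heteroskedasticity-robust (sandwich) covariance estimator, and the resulting Wald interval for a linear contrast. This is a minimal illustration of that baseline under an assumed simulated design, not the paper's debiased estimator; all variable names and the data-generating process are illustrative.

```python
# Minimal sketch (assumption: simulated Gaussian design with heteroskedastic
# errors) of the classical OLS + sandwich + Wald pipeline described above.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 10                       # sample size n, covariate dimension d
X = rng.standard_normal((n, d))
beta = np.ones(d)
# Heteroskedastic errors, so the sandwich (not the model-based) variance is needed
y = X @ beta + rng.standard_normal(n) * (1.0 + 0.5 * np.abs(X[:, 0]))

# Least squares estimate of the projection parameter
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

# Sandwich covariance: Sigma^{-1} (n^{-1} sum_i e_i^2 x_i x_i') Sigma^{-1} / n
Sigma_hat = X.T @ X / n
meat = (X * resid[:, None] ** 2).T @ X / n
Sigma_inv = np.linalg.inv(Sigma_hat)
V_hat = Sigma_inv @ meat @ Sigma_inv / n

# 95% Wald interval for the linear contrast c' beta (here: first coordinate)
c = np.zeros(d)
c[0] = 1.0
se = float(np.sqrt(c @ V_hat @ c))   # sandwich standard error of c' beta_hat
ci = (float(c @ beta_hat) - 1.96 * se, float(c @ beta_hat) + 1.96 * se)
```

The abstract's point is that intervals of this form are only guaranteed to be asymptotically valid when $d = o(n^{1/2})$; the proposed bias correction extends validity to $d = o(n^{2/3})$ up to polylogarithmic factors.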