Online statistical inference enables real-time analysis of sequentially collected data, which distinguishes it from traditional methods that rely on static datasets. This paper introduces a novel approach to online inference in high-dimensional generalized linear models, in which the regression coefficient estimates and their standard errors are updated each time new data arrive. In contrast to existing methods, which require either access to the full dataset or storage of high-dimensional summary statistics, our method operates in a single pass, significantly reducing both time and space complexity. The core methodological innovation is an adaptive stochastic gradient descent algorithm tailored to dynamic objective functions, coupled with a novel online debiasing procedure. This combination allows us to maintain only low-dimensional summary statistics while effectively controlling the optimization errors introduced by the dynamically changing loss functions. We show that our method, termed the Approximated Debiased Lasso (ADL), not only reduces the need for the bounded individual probability condition but also significantly improves numerical performance. Numerical experiments demonstrate that the proposed ADL method performs robustly across various covariance matrix structures.
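To make the single-pass setting concrete, the following is a minimal sketch of a one-pass proximal stochastic gradient update for an l1-penalized logistic regression, one common instance of a generalized linear model. It is not the ADL procedure: it omits the online debiasing step and the standard error updates, and the function names, the step size schedule `eta0 / sqrt(t)`, the penalty level `lam`, and the simulated data stream are all illustrative assumptions. It only shows how an estimate can be refreshed on each new observation while storing nothing beyond a length-`p` coefficient vector.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of the l1 norm: shrink each entry of v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def online_l1_logistic(stream, p, lam=0.05, eta0=0.5):
    """One-pass proximal SGD for l1-penalized logistic regression.

    Only the current coefficient vector (length p) is kept in memory;
    each observation is used once and then discarded.
    lam and eta0 are illustrative (assumed) tuning values.
    """
    beta = np.zeros(p)
    for t, (x, y) in enumerate(stream, start=1):
        eta = eta0 / np.sqrt(t)                  # assumed decaying step size
        mu = 1.0 / (1.0 + np.exp(-(x @ beta)))   # predicted probability
        grad = (mu - y) * x                      # gradient of the logistic loss at (x, y)
        beta = soft_threshold(beta - eta * grad, eta * lam)
    return beta


def simulate_stream(n, beta_true, seed=0):
    """Hypothetical data stream: Gaussian covariates with Bernoulli responses."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        x = rng.standard_normal(beta_true.size)
        prob = 1.0 / (1.0 + np.exp(-(x @ beta_true)))
        y = float(rng.random() < prob)
        yield x, y


# Sparse ground truth with three nonzero coefficients out of 50.
beta_true = np.zeros(50)
beta_true[:3] = [1.5, -1.0, 0.8]
beta_hat = online_l1_logistic(simulate_stream(5000, beta_true), p=50)
```

Because the stream is consumed through a generator, memory use stays at O(p) regardless of how many observations arrive. The l1 shrinkage biases the resulting estimates, and a plain update of this kind provides no standard errors, so valid inference would still require a debiasing step such as the one the paper proposes.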