The construction of confidence intervals and hypothesis tests for functionals is a cornerstone of statistical inference. Traditionally, the most efficient procedures, such as the Wald interval or the likelihood ratio test, require both a point estimator and a consistent estimate of its asymptotic variance. However, when estimators are derived from online or sequential algorithms, computational constraints often preclude multiple passes over the data, which complicates variance estimation. In this article, we propose a computationally efficient, rate-optimal wrapper method (HulC) that can be applied to any online algorithm to produce asymptotically valid confidence regions without explicit estimation of the asymptotic variance. The method is provably valid for any online algorithm that yields an asymptotically normal estimator. We evaluate the practical performance of the proposed method primarily using Stochastic Gradient Descent (SGD) with Polyak-Ruppert averaging. Furthermore, we provide extensive numerical simulations comparing the performance of HulC when used with other online algorithms, including implicit-SGD and ROOT-SGD.
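To make the wrapper idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the basic non-randomized HulC construction for a scalar functional: split the data into B disjoint batches, where B is the smallest integer with 2^(1-B) <= alpha, run the online algorithm once on each batch, and report the smallest and largest of the B estimates as the interval. The function names (`hulc_num_batches`, `averaged_sgd`, `hulc_interval`), the least-squares model, and the step-size schedule are all illustrative assumptions, not taken from the article.

```python
import math
import numpy as np


def hulc_num_batches(alpha: float) -> int:
    """Smallest B with 2**(1 - B) <= alpha (non-randomized variant, assumed)."""
    return math.ceil(math.log2(2.0 / alpha))


def averaged_sgd(X, y, lr0=0.2, decay=0.6):
    """One pass of least-squares SGD; returns the Polyak-Ruppert average.

    The step size lr0 * t**(-decay) is an illustrative choice.
    """
    theta = np.zeros(X.shape[1])
    theta_bar = np.zeros(X.shape[1])
    for t, (x_t, y_t) in enumerate(zip(X, y), start=1):
        grad = (x_t @ theta - y_t) * x_t          # gradient of 0.5 * (x'theta - y)^2
        theta = theta - lr0 * t ** (-decay) * grad
        theta_bar += (theta - theta_bar) / t      # running average of the iterates
    return theta_bar


def hulc_interval(X, y, coord, alpha=0.05):
    """Interval for one coefficient: min and max of B per-batch estimates."""
    B = hulc_num_batches(alpha)
    estimates = [
        averaged_sgd(Xb, yb)[coord]
        for Xb, yb in zip(np.array_split(X, B), np.array_split(y, B))
    ]
    return min(estimates), max(estimates)


# Simulated data (hypothetical): y = X @ beta + noise, single pass per batch.
rng = np.random.default_rng(0)
beta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(6000, 3))
y = X @ beta + rng.normal(size=6000)

lo, hi = hulc_interval(X, y, coord=0, alpha=0.05)
print(f"HulC interval for beta[0]: [{lo:.3f}, {hi:.3f}]")
```

No variance estimate appears anywhere in the sketch: each batch is visited once, and the interval width comes entirely from the spread of the per-batch estimates. In a streaming setting one could instead keep B running copies of the algorithm and route each incoming observation to one of them, which is again an assumption about usage rather than a detail given here.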