Stochastic gradient descent (SGD) is a foundational algorithm for large-scale statistical learning and stochastic optimization. However, statistical inference based on SGD iterates remains challenging when stochastic gradients have infinite variance, as the relevant limiting distributions depend on unknown nuisance parameters. In this paper, we develop an efficient, model-agnostic methodology for constructing confidence regions from SGD trajectories that applies in both finite- and infinite-variance regimes. The procedure is based on a joint weak convergence result for the Polyak-Ruppert averaged estimator and an empirical second-moment normalizer constructed from stochastic gradients along the SGD trajectory. This joint limit yields a self-normalized statistic in which the leading tail-dependent scaling terms cancel. We then use a subsampling calibration scheme to estimate the relevant critical values, avoiding explicit estimation of tail indices, slowly varying functions, or stable-law parameters. The resulting confidence regions are straightforward to implement and are asymptotically valid under both the finite- and infinite-second-moment regimes. Simulation studies show reliable coverage in various settings, supporting the proposed method as a practical tool for uncertainty quantification in stochastic optimization.
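To make the pipeline described in the abstract concrete, the following is a minimal sketch on a scalar toy problem (estimating a location parameter from heavy-tailed samples). It is not the paper's exact construction: the function names `sgd_trajectory` and `self_normalized_ci`, the step-size schedule, the block length, and the specific form of the normalizer and of the subsample statistic are all illustrative assumptions. The sketch only shows the general shape of the idea: average the iterates, normalize by an empirical second moment of the stochastic gradients, and calibrate the critical value by recomputing the same statistic on blocks of the trajectory.

```python
import numpy as np


def sgd_trajectory(samples, theta0=0.0, lr0=0.5, decay=0.7):
    """Run scalar SGD on the loss 0.5 * (theta - x)^2.

    Returns the iterates and the stochastic gradients seen along the path.
    The step size lr0 * i^(-decay) is an assumed, commonly used schedule.
    """
    n = len(samples)
    thetas = np.empty(n)
    grads = np.empty(n)
    theta = theta0
    for i, x in enumerate(samples):
        g = theta - x  # stochastic gradient at the current iterate
        theta = theta - lr0 * (i + 1) ** (-decay) * g
        thetas[i] = theta
        grads[i] = g
    return thetas, grads


def self_normalized_ci(thetas, grads, block_len, level=0.95):
    """Subsampling-calibrated interval around the averaged iterate (assumed form)."""
    n = len(thetas)
    theta_bar = thetas.mean()      # Polyak-Ruppert average
    v_n = np.sum(grads ** 2)       # empirical second-moment normalizer

    # Prefix sums give the mean and normalizer of every overlapping block.
    csum_t = np.concatenate(([0.0], np.cumsum(thetas)))
    csum_g = np.concatenate(([0.0], np.cumsum(grads ** 2)))
    starts = np.arange(0, n - block_len + 1)
    block_means = (csum_t[starts + block_len] - csum_t[starts]) / block_len
    block_v = csum_g[starts + block_len] - csum_g[starts]

    # Block-level self-normalized statistic, centered at the full-sample average.
    stats = block_len * np.abs(block_means - theta_bar) / np.sqrt(block_v)
    crit = np.quantile(stats, level)  # estimated critical value

    half_width = crit * np.sqrt(v_n) / n
    return theta_bar, theta_bar - half_width, theta_bar + half_width


# Usage: Student-t noise with 1.5 degrees of freedom has a finite mean
# but infinite variance, so the true location parameter is 1.0.
rng = np.random.default_rng(0)
samples = 1.0 + rng.standard_t(df=1.5, size=20000)
thetas, grads = sgd_trajectory(samples)
theta_bar, lo, hi = self_normalized_ci(thetas, grads, block_len=200)
print(f"estimate {theta_bar:.3f}, interval [{lo:.3f}, {hi:.3f}]")
```

Note that nothing in the sketch estimates a tail index or a stable-law parameter; the only tuning choice is the block length, which in this toy version is picked by hand.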