This paper develops an asymptotic theory for online inference of stochastic gradient descent (SGD) iterates with dropout regularization in linear regression. Specifically, we establish a geometric-moment contraction (GMC) for constant step-size SGD dropout iterates, which implies the existence of a unique stationary distribution of the dropout recursion. Leveraging the GMC property, we prove quenched central limit theorems (CLTs) for the difference between the dropout and $\ell^2$-regularized iterates, regardless of initialization. We also present a CLT for the difference between the Ruppert-Polyak averaged SGD (ASGD) iterates with dropout and the averaged $\ell^2$-regularized iterates. Building on these asymptotic normality results, we further introduce an online estimator of the long-run covariance matrix of the ASGD dropout iterates, which enables recursive inference with low computational and memory cost. Numerical experiments demonstrate that, for sufficiently large sample sizes, the proposed confidence intervals for ASGD with dropout attain nearly nominal coverage probability.
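For concreteness, a minimal sketch of the recursion under study, in illustrative notation not fixed by this abstract: given observations $(x_k, y_k)$, a constant step size $\alpha > 0$, and i.i.d. diagonal dropout matrices $D_k$ with Bernoulli entries on the diagonal, the SGD dropout iterate and its Ruppert-Polyak average may be written as
$$
\hat{\beta}_k \;=\; \hat{\beta}_{k-1} \;+\; \alpha\, D_k x_k \bigl( y_k - x_k^\top D_k \hat{\beta}_{k-1} \bigr),
\qquad
\bar{\beta}_n \;=\; \frac{1}{n} \sum_{k=1}^{n} \hat{\beta}_k,
$$
with the precise assumptions, scaling, and the corresponding $\ell^2$-regularized recursion deferred to the main text.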