We consider an on-line least squares regression problem with optimal solution $\theta^*$ and Hessian matrix $H$, and study a time-average stochastic gradient descent estimator of $\theta^*$. For $k\ge2$, we provide an unbiased estimator of $\theta^*$ that is a modification of the time-average estimator, runs in an expected number of time-steps of order $k$, and achieves $O(1/k)$ expected excess risk. The constant behind the $O$ notation depends on parameters of the regression and is a poly-logarithmic function of the smallest eigenvalue of $H$. We provide both biased and unbiased estimators of the expected excess risk of the time-average estimator and of its unbiased counterpart, without requiring knowledge of either $H$ or $\theta^*$. We describe an "average-start" version of our estimators with similar properties. Our approach is based on randomized multilevel Monte Carlo. Our numerical experiments confirm our theoretical findings.
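To illustrate the debiasing idea, the following is a minimal sketch of a single-term randomized multilevel Monte Carlo (Rhee–Glynn style) estimator built on top of a time-averaged SGD iterate. The toy regression problem, step-size schedule, geometric level distribution, and all function names here are illustrative assumptions, not the paper's exact construction; the two levels are coupled by sharing one data stream, and the geometric parameter $p > 1/2$ keeps the expected number of time-steps of order $k$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy on-line least squares problem with known optimum.
theta_star = np.array([1.0, -2.0])

def coupled_averages(n1, n2, step=0.1):
    """Run n2 >= n1 SGD steps on ONE data stream; return the time-average
    iterates after n1 and n2 steps (coupled by sharing the stream)."""
    theta = np.zeros(2)
    avg = np.zeros(2)
    avg1 = None
    for t in range(1, n2 + 1):
        x = rng.standard_normal(2)
        y = x @ theta_star + 0.1 * rng.standard_normal()
        # Stochastic gradient of the squared loss 0.5 * (x.theta - y)^2.
        theta -= step / np.sqrt(t) * (x @ theta - y) * x
        avg += (theta - avg) / t            # running time-average
        if t == n1:
            avg1 = avg.copy()
    return avg1, avg

def unbiased_estimate(k, p=0.6):
    """Single-term randomized MLMC debiasing: draw a level N ~ Geometric(p),
    return Delta_N / P(N), whose expectation telescopes to the limit of the
    time-average estimator. Expected cost is O(k) since (1 - p) * 2 < 1."""
    N = rng.geometric(p) - 1                # level in {0, 1, 2, ...}
    prob = p * (1 - p) ** N
    if N == 0:
        delta = coupled_averages(k, k)[1]   # base level: Y_k itself
    else:
        a_half, a_full = coupled_averages(k * 2 ** (N - 1), k * 2 ** N)
        delta = a_full - a_half             # level-N correction
    return delta / prob
```

Averaging many independent draws of `unbiased_estimate(k)` approximates $\theta^*$ without the bias of any fixed-horizon time average, at the price of a random (but order-$k$ in expectation) running time per draw.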