In this paper, we investigate the theoretical properties of stochastic gradient descent (SGD) for statistical inference in nonconvex optimization problems, a setting that has been relatively unexplored compared with the convex case. Our study is the first to establish provable inferential procedures using the SGD estimator for general nonconvex objective functions, which may contain multiple local minima. We propose two novel online inferential procedures that combine SGD with the multiplier bootstrap. The first procedure employs a consistent covariance matrix estimator, for which we establish the error convergence rate. The second procedure approximates the limit distribution using bootstrap SGD estimators, yielding asymptotically valid bootstrap confidence intervals. We validate the effectiveness of both approaches through numerical experiments. Furthermore, our analysis yields an intermediate result: the in-expectation error convergence rate of the original SGD estimator in nonconvex settings, which is comparable to existing results for convex problems. We believe this finding is of independent interest and enriches the literature on optimization and statistical inference.
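
To make the idea of combining SGD with a multiplier bootstrap concrete, the following is a minimal sketch and not the procedure analyzed in the paper. Everything specific in it is an assumption made for illustration: the toy nonconvex loss f(theta; x) = theta^4/4 - x*theta^2/2 (local minima at +1 and -1 when E[x] = 1), the averaging of iterates, the step-size schedule lr0 * t^(-decay), the two-point multipliers on {0, 2} (mean 1, variance 1), and the hypothetical names `grad_loss` and `online_bootstrap_sgd`. Each bootstrap replicate runs its own SGD recursion on the same data stream with its gradient rescaled by a random multiplier, and the spread of the replicates around the main estimate is used to form an interval.

```python
import numpy as np


def grad_loss(theta, x):
    """Gradient of the toy nonconvex loss theta**4 / 4 - x * theta**2 / 2."""
    return theta**3 - x * theta


def online_bootstrap_sgd(data, theta0, n_boot=200, lr0=0.2, decay=0.6,
                         level=0.95, seed=0):
    """One pass over `data`: averaged SGD plus `n_boot` perturbed copies."""
    rng = np.random.default_rng(seed)
    theta = float(theta0)                   # main SGD iterate
    theta_boot = np.full(n_boot, theta)     # perturbed SGD iterates
    avg, avg_boot = 0.0, np.zeros(n_boot)   # running averages of the iterates

    for t, x in enumerate(data, start=1):
        lr = lr0 * t ** (-decay)
        # Assumed multipliers: 0 or 2 with equal probability (mean 1, variance 1).
        w = 2.0 * rng.integers(0, 2, size=n_boot)
        theta -= lr * grad_loss(theta, x)
        theta_boot -= lr * w * grad_loss(theta_boot, x)
        # Update the running averages online, without storing past iterates.
        avg += (theta - avg) / t
        avg_boot += (theta_boot - avg_boot) / t

    # Basic bootstrap interval: quantiles of (avg_boot - avg) stand in for
    # the unknown sampling distribution of (avg - target).
    tail = (1.0 - level) / 2.0
    q_lo, q_hi = np.quantile(avg_boot - avg, [tail, 1.0 - tail])
    return float(avg), (float(avg - q_hi), float(avg - q_lo))


# Usage on simulated data with E[x] = 1; starting at 0.5 targets the minimum at +1.
data = np.random.default_rng(1).uniform(0.5, 1.5, size=5000)
est, (lo, hi) = online_bootstrap_sgd(data, theta0=0.5)
print(f"estimate = {est:.4f}, interval = ({lo:.4f}, {hi:.4f})")
```

The bounded multipliers and bounded data are chosen here only so that, in this toy problem, no replicate can jump across zero into the basin of the other local minimum; with unbounded weights or a different objective that would need separate care.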