Stochastic gradient descent (SGD) has emerged as the quintessential method in a data scientist's toolbox. Over the last two decades, the learning theory and optimization literature has made much progress toward understanding the iteration complexity of SGD, both in expectation and with high probability. However, using SGD for high-stakes applications requires careful quantification of the associated uncertainty. Toward that end, in this work, we establish high-dimensional Central Limit Theorems (CLTs) for linear functionals of online least-squares SGD iterates under a Gaussian design assumption. Our main result shows that a CLT holds even when the dimensionality is of exponential order in the number of online SGD iterations, thereby enabling high-dimensional inference with online SGD. Our proof technique leverages Berry-Esseen bounds developed for martingale difference sequences and carefully evaluates the required moment and quadratic variation terms through recent advances in concentration inequalities for products of random matrices. We also provide an online approach for estimating the variance appearing in the CLT (required for constructing confidence intervals in practice) and establish its consistency in the high-dimensional setting.
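To make the object of study concrete, the following is a minimal simulation sketch of online least-squares SGD under a standard Gaussian design, tracking the error of one linear functional of the final iterate across independent runs. It is an illustration only, not the procedure or the variance estimator from this work: the function name `online_least_squares_sgd`, the constant step size, the dimension, the noise level, and the choice of functional `a` are all assumptions made for the example.

```python
import numpy as np


def online_least_squares_sgd(theta_star, n_steps, step_size, noise_std, rng):
    """Run one pass of online least-squares SGD (hypothetical helper).

    Each step draws a fresh pair (x, y) with x ~ N(0, I) and
    y = <x, theta_star> + noise, then takes one gradient step on the
    squared loss 0.5 * (y - <x, theta>)^2.
    """
    d = theta_star.shape[0]
    theta = np.zeros(d)
    for _ in range(n_steps):
        x = rng.standard_normal(d)                       # Gaussian design (assumed isotropic)
        y = x @ theta_star + noise_std * rng.standard_normal()
        theta = theta + step_size * (y - x @ theta) * x  # SGD update
    return theta


rng = np.random.default_rng(0)

# Illustrative settings (assumptions, not values from the paper).
d, n_steps, n_reps = 20, 1000, 100
step_size, noise_std = 0.01, 0.1
theta_star = np.ones(d) / np.sqrt(d)

# Linear functional <a, theta>: here simply the first coordinate.
a = np.zeros(d)
a[0] = 1.0

# Error of the functional over independent runs; a CLT describes the
# fluctuations of this quantity after suitable centering and scaling.
errors = np.array([
    a @ online_least_squares_sgd(theta_star, n_steps, step_size, noise_std, rng)
    - a @ theta_star
    for _ in range(n_reps)
])

print("mean error:", errors.mean())
print("std of error:", errors.std(ddof=1))
```

A histogram of `errors` divided by its sample standard deviation gives a rough visual check of approximate normality in this toy setting. The constant step size is chosen only for simplicity; the step-size schedule and scaling under which the CLT holds are those specified in the paper.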