Stochastic gradient descent (SGD) is a cornerstone algorithm for high-dimensional optimization, renowned for its empirical successes. Recent theoretical advances have provided a deep understanding of how SGD enables feature learning in high-dimensional nonlinear models, most notably the \textit{single-index model} with i.i.d. data. In this work, we study the sequential learning problem for single-index models, also known as generalized linear bandits or ridge bandits, where SGD is a simple and natural solution, yet its learning dynamics remain largely unexplored. We show that, similar to the optimal interactive learner, SGD undergoes a distinct ``burn-in'' phase before entering the ``learning'' phase in this setting. Moreover, with an appropriately chosen learning rate schedule, a single SGD procedure simultaneously achieves near-optimal (or best-known) sample complexity and regret guarantees across both phases, for a broad class of link functions. Our results demonstrate that SGD remains highly competitive for learning single-index models under adaptive data.
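To make the setting concrete, below is a minimal, illustrative sketch of projected online SGD for a single-index model $y = f(\langle \theta^\star, x\rangle) + \varepsilon$. Everything specific here is an assumption chosen for illustration: the tanh link, i.i.d. Gaussian queries (in place of the adaptively chosen queries of the bandit setting), the constant $1/d$ step size, and the dimension and horizon; it does not reproduce the paper's phase-dependent learning rate schedule.

```python
import numpy as np

# Illustrative sketch only: projected online SGD for a single-index model
# y = f(<theta_star, x>) + noise. The link f = tanh, the Gaussian queries,
# the step size, and (d, T) are all assumptions for this demo; the paper's
# adaptive query rule and phase-dependent schedule are not reproduced here.

rng = np.random.default_rng(0)
d, T = 50, 20_000

theta_star = rng.standard_normal(d)
theta_star /= np.linalg.norm(theta_star)   # unknown unit index direction

f = np.tanh                                # example link function
def f_prime(z):
    return 1.0 - np.tanh(z) ** 2           # derivative of the link

theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)             # random unit-norm initialization

eta = 1.0 / d                              # placeholder step size

for t in range(T):
    x = rng.standard_normal(d)             # i.i.d. Gaussian query
    y = f(x @ theta_star) + 0.1 * rng.standard_normal()  # noisy response

    # one SGD step on the squared loss (f(<theta, x>) - y)^2 / 2
    z = x @ theta
    theta -= eta * (f(z) - y) * f_prime(z) * x
    theta /= np.linalg.norm(theta)         # project back onto the sphere

print("alignment <theta, theta_star>:", theta @ theta_star)
```

The unit-sphere projection is a standard device in single-index analyses, tracking only the direction of the estimate; in the sequential (bandit) setting studied in the paper, the queries $x_t$ would instead be chosen adaptively by the learner.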