We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a new analysis of the "vanilla" PG method, achieving the best-known iteration complexity for finding an approximate stationary point of the problem. We then develop an "auto-conditioned" projected gradient (AC-PG) variant that achieves the same iteration complexity without requiring the Lipschitz constant of the gradient as an input or any line search procedure. The key idea is to estimate the Lipschitz constant using first-order information gathered from previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize the PG methods to the stochastic setting by proposing a stochastic projected gradient (SPG) method and a variance-reduced stochastic projected gradient (VR-SPG) method, achieving new complexity bounds in different oracle settings. We also present auto-conditioned stepsize policies for both stochastic PG methods and establish comparable convergence guarantees.
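To make the auto-conditioned idea concrete, the following is a minimal sketch of a projected gradient loop that estimates the Lipschitz constant from first-order information at past iterates, rather than taking it as an input. All names here (`ac_pg_sketch`, `project_box`, the box constraint set, the `max`-based curvature update) are illustrative assumptions for exposition; the paper's actual AC-PG stepsize policy and its handling of underestimates are more refined.

```python
import numpy as np

def project_box(x, lo, hi):
    # Euclidean projection onto the box [lo, hi]^n, a simple convex compact set
    # chosen here purely for illustration.
    return np.clip(x, lo, hi)

def ac_pg_sketch(grad_f, x0, lo, hi, iters=200, L0=1.0):
    # Illustrative auto-conditioned projected gradient loop: the Lipschitz
    # estimate L is updated from a local curvature ratio built out of gradients
    # at consecutive iterates (first-order information only).
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad_f(x_prev)
    L = L0
    x = project_box(x_prev - g_prev / L, lo, hi)
    for _ in range(iters):
        g = grad_f(x)
        dx = np.linalg.norm(x - x_prev)
        if dx > 1e-12:
            # Local secant estimate of the gradient's Lipschitz constant;
            # a simplified stand-in for the paper's update rule.
            L = max(L, np.linalg.norm(g - g_prev) / dx)
        x_prev, g_prev = x, g
        x = project_box(x - g / L, lo, hi)  # projected gradient step with stepsize 1/L
    return x
```

For example, minimizing the smooth quadratic f(x) = ½‖x − c‖² over the box [−1, 1]² with c = (2, −3) outside the box drives the iterates to the projection of c onto the box, (1, −1), without L ever being supplied by the user.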