In this paper, we explore two fundamental first-order algorithms in convex optimization, namely gradient descent (GD) and the proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and thus incur no additional computational cost. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].
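To make the idea of a stepsize built from observed gradient differences concrete, the following is a minimal sketch, not the method proposed in this paper. It assumes the baseline rule as recalled from [MM20], namely $\lambda_k = \min\{\sqrt{1+\theta_{k-1}}\,\lambda_{k-1},\ \|x_k - x_{k-1}\| / (2\|\nabla f(x_k) - \nabla f(x_{k-1})\|)\}$ with $\theta_k = \lambda_k/\lambda_{k-1}$; the larger stepsizes announced above are not reproduced here. The function name `adaptive_gd`, the initial stepsize `lam0`, and the iteration count are hypothetical choices made for illustration.

```python
import math


def adaptive_gd(grad, x0, lam0=1e-6, iters=500):
    """Gradient descent with a stepsize estimated from gradient differences.

    Assumption: this follows the baseline rule recalled from [MM20], not the
    larger stepsizes proposed in the paper. Only one gradient is evaluated
    per iteration, so adaptivity adds no extra gradient calls.
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # One tiny initial step, only to obtain a first gradient difference.
    x = [xi - lam0 * gi for xi, gi in zip(x_prev, g_prev)]
    lam_prev = lam0
    theta = float("inf")  # lets the first curvature estimate set the stepsize

    for _ in range(iters):
        g = grad(x)
        dx = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prev)))
        dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_prev)))
        if dg == 0.0:
            # No curvature information is available (for example, the
            # iterates have stopped moving); this sketch simply stops.
            return x
        growth = math.sqrt(1.0 + theta) * lam_prev  # limits stepsize increase
        local = dx / (2.0 * dg)  # half the inverse local Lipschitz estimate
        lam = min(growth, local)
        theta = lam / lam_prev
        x_prev, g_prev, lam_prev = x, g, lam
        x = [xi - lam * gi for xi, gi in zip(x, g)]
    return x
```

For example, on the quadratic $f(x) = \tfrac{1}{2}(x_1^2 + 10x_2^2)$, calling `adaptive_gd(lambda v: [v[0], 10.0 * v[1]], [1.0, 1.0])` should return a point close to the minimizer at the origin without the user supplying a Lipschitz constant.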