In this paper, we explore two fundamental first-order algorithms in convex optimization: gradient descent (GD) and the proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and thus incur no additional computational cost. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].
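To make the idea of a stepsize built from observed gradient differences concrete, the following is a minimal sketch rather than the method proposed here. It assumes the earlier rule as recalled from [MM20], in which the stepsize is capped by the local curvature estimate $\|x_k - x_{k-1}\| / (2\|\nabla f(x_k) - \nabla f(x_{k-1})\|)$ and may grow by at most a factor $\sqrt{1+\theta_{k-1}}$ per iteration. The function name `adaptive_gd`, its parameters, and the tiny initial step `step0` are hypothetical choices made for illustration; the larger stepsizes of this paper are not reproduced.

```python
import math


def adaptive_gd(grad, x0, num_iters=500, step0=1e-6):
    """Gradient descent with a stepsize estimated from gradient differences.

    Assumed rule (recalled from [MM20], not this paper's larger stepsize):
        step_k  = min( sqrt(1 + theta_{k-1}) * step_{k-1},
                       ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||) )
        theta_k = step_k / step_{k-1}
    where g_k denotes the gradient at x_k.
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    step_prev = step0
    theta = float("inf")  # first adaptive step is set by the curvature term alone

    # One tiny plain gradient step to obtain a first pair of iterates.
    x = [xi - step_prev * gi for xi, gi in zip(x_prev, g_prev)]

    for _ in range(num_iters):
        g = grad(x)
        dx = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prev)))
        dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_prev)))
        if dx == 0.0:
            break  # iterates stopped moving (zero gradient or underflow)

        growth = math.sqrt(1.0 + theta) * step_prev      # limits how fast the step grows
        local = dx / (2.0 * dg) if dg > 0.0 else float("inf")  # inverse local curvature
        step = min(growth, local)
        if math.isinf(step):
            break  # no curvature information available yet

        theta = step / step_prev
        x_prev, g_prev, step_prev = x, g, step
        x = [xi - step * gi for xi, gi in zip(x, g)]  # gradient step with the new stepsize
    return x


if __name__ == "__main__":
    # Illustrative problem: f(x) = 0.5 * (x1^2 + 10 * x2^2), minimized at the origin.
    quad_grad = lambda x: [x[0], 10.0 * x[1]]
    print(adaptive_gd(quad_grad, [1.0, 1.0]))
```

Each iteration reuses the gradient that is needed for the update anyway, so the stepsize estimate costs only two vector norms. No global Lipschitz constant and no line search is passed in, which is the sense in which such a scheme is adaptive.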