Incremental gradient and incremental proximal methods are a fundamental class of optimization algorithms for solving finite-sum problems and have been broadly studied in the literature. Yet nonasymptotic (first-order or proximal) oracle complexity bounds for these methods were obtained only fairly recently, and they apply almost exclusively to the average iterate. Motivated by applications in continual learning, we obtain the first convergence guarantees for the last iterate of both incremental gradient and incremental proximal methods, in general convex smooth (for both) and convex Lipschitz (for the proximal variants) settings. Our oracle complexity bounds for the last iterate nearly match (i.e., match up to a square-root-log or a log factor) the best known oracle complexity bounds for the average iterate, for both classes of methods. We further generalize our results to weighted averaging of the iterates with increasing weights, which can be seen as interpolating between the last-iterate and the average-iterate guarantees. Additionally, we discuss how our results can be generalized to variants of the studied incremental methods with permuted ordering of updates. Our results extend last-iterate guarantees for incremental methods beyond the state of the art, as such results were previously known only for overparameterized linear models, which correspond to convex quadratic problems with infinitely many solutions.
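To make the three quantities compared in the abstract concrete, the following is a minimal sketch of a cyclic incremental gradient method on a toy one-dimensional least-squares finite sum. It tracks the last iterate, the uniform average of the end-of-epoch iterates, and a weighted average with increasing weights. The function names, the toy objective, the constant step size, and the weights w_k = k are all illustrative assumptions, not the step sizes or weights analyzed in the paper.

```python
def incremental_gradient(a, b, x0=0.0, step=0.05, epochs=200):
    """Cyclic incremental gradient on f(x) = (1/n) * sum_i 0.5 * (a[i]*x - b[i])**2.

    Returns (last iterate, uniform average, increasing-weight average),
    where both averages are taken over the end-of-epoch iterates.
    """
    n = len(a)
    x = x0
    avg = 0.0    # running uniform average of end-of-epoch iterates
    wavg = 0.0   # running weighted average with assumed weights w_k = k
    wsum = 0.0   # sum of the weights used so far
    for k in range(1, epochs + 1):
        for i in range(n):  # one pass over the components in a fixed order
            grad_i = a[i] * (a[i] * x - b[i])  # gradient of the i-th component
            x -= step * grad_i
        avg += (x - avg) / k
        wsum += k
        wavg += k * (x - wavg) / wsum
    return x, avg, wavg


def objective(a, b, x):
    """Value of the finite-sum objective at x."""
    return sum(0.5 * (ai * x - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [1.0, 2.0, 3.0]  # toy data: every component is minimized at x = 1
    last, avg, wavg = incremental_gradient(a, b)
    for name, x in [("last", last), ("weighted", wavg), ("average", avg)]:
        print(name, x, objective(a, b, x))
```

Replacing the fixed inner loop order with a fresh permutation per epoch would give a permuted-ordering variant of the kind the abstract mentions; this sketch does not attempt to reproduce any of the paper's bounds.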