Numerous optimization algorithms have a time-varying update rule due to, for instance, a changing step size, momentum parameter, or Hessian approximation. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We adapt these convergence results and apply them to proximal gradient descent with variable step size and to FISTA when solving partly smooth problems. We confirm our findings numerically by solving $\ell_1$-regularized linear regression and $\ell_2$-regularized logistic regression. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.
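The unrolled-differentiation idea described above can be illustrated with a minimal sketch, assuming JAX as the autodiff framework. This is not the paper's implementation: it runs ISTA (proximal gradient descent) on an $\ell_1$-regularized least-squares problem with a hypothetical time-varying step-size schedule, then differentiates the final iterate with respect to the regularization parameter by tracing through every iteration:

```python
import jax
import jax.numpy as jnp

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return jnp.sign(x) * jnp.maximum(jnp.abs(x) - tau, 0.0)

def ista(lam, A, b, n_iters=200):
    """Proximal gradient descent for 0.5*||Ax - b||^2 + lam*||x||_1.

    Uses a hypothetical time-varying step size that stays in (0, 1/L),
    where L is the Lipschitz constant of the smooth part's gradient.
    """
    L = jnp.linalg.norm(A, 2) ** 2          # squared spectral norm = Lipschitz constant
    x = jnp.zeros(A.shape[1])
    for k in range(n_iters):
        step = (1.0 - 0.5 / (k + 2)) / L    # time-varying step size, always below 1/L
        grad = A.T @ (A @ x - b)            # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Unrolled differentiation: jax.jacobian differentiates through all
# iterations, yielding the derivative iterate dx/dlam at the final step.
A = jax.random.normal(jax.random.PRNGKey(0), (20, 10))
b = jax.random.normal(jax.random.PRNGKey(1), (20,))
dx_dlam = jax.jacobian(ista)(0.1, A, b)     # one derivative per coordinate of x
```

Because the loop is unrolled in the autodiff trace, the returned Jacobian reflects how all iterations, including the time-varying steps, respond to a change in $\lambda$; the paper's guarantees concern how such derivative iterates converge.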