Algorithmic reproducibility measures the deviation in the outputs of machine learning algorithms under minor changes to the training process. Previous work suggests that first-order methods must trade off convergence rate (gradient complexity) for better reproducibility. In this work, we challenge this perception and demonstrate that optimal reproducibility and near-optimal convergence guarantees can be achieved simultaneously for smooth convex minimization and smooth convex-concave minimax problems under various error-prone oracle settings. In particular, given an inexact initialization oracle, our regularization-based algorithms achieve the best of both worlds (optimal reproducibility and near-optimal gradient complexity) for both minimization and minimax optimization. With an inexact gradient oracle, the near-optimal guarantees also hold for minimax optimization. Additionally, with a stochastic gradient oracle, we show that stochastic gradient descent ascent is optimal in terms of both reproducibility and gradient complexity. We believe our results contribute to an enhanced understanding of the reproducibility-convergence trade-off in the context of convex optimization.
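As a rough illustration of the quantities discussed above, the following sketch measures reproducibility as the squared distance between the outputs of two gradient descent runs started from nearby initial points, which stands in for an inexact initialization oracle. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy quadratic objective, the helper names `gradient_descent` and `deviation`, the choice of an l2 regularizer centered at the origin, and the values of `lam`, `delta`, and `iters`.

```python
import numpy as np


def gradient_descent(grad, x0, step, iters):
    """Plain gradient descent with a constant step size."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad(x)
    return x


def deviation(lam, delta=1e-2, iters=500):
    """Squared distance between two runs whose initial points differ by delta.

    Toy objective (an assumption): f(x) = 0.5 * x^T A x - b^T x, which is
    smooth and convex but flat along the second coordinate. A regularizer
    (lam / 2) * ||x||^2 is added when lam > 0.
    """
    A = np.diag([1.0, 0.0])
    b = np.array([1.0, 0.0])

    def grad(x):
        # Gradient of the (possibly regularized) objective.
        return A @ x - b + lam * x

    # Step size 1/L, where L = 1 + lam is the smoothness constant.
    step = 1.0 / (1.0 + lam)

    x0 = np.array([0.5, 0.5])
    x0_perturbed = x0 + np.array([0.0, delta])  # perturb the flat direction

    out_a = gradient_descent(grad, x0, step, iters)
    out_b = gradient_descent(grad, x0_perturbed, step, iters)
    return float(np.sum((out_a - out_b) ** 2))


plain = deviation(lam=0.0)        # no regularization
regularized = deviation(lam=0.1)  # with l2 regularization

print("deviation without regularization:", plain)
print("deviation with regularization:   ", regularized)
```

On this toy problem, unregularized gradient descent carries the initial perturbation along the flat direction through to the output unchanged, while the regularized iteration contracts it by a factor of `1 - step * lam` at every step. The regularizer also shifts the minimizer, so `lam` has to be balanced against optimization accuracy; the sketch does not attempt to reproduce how the paper makes that choice.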