Algorithmic reproducibility measures the deviation in the outputs of machine learning algorithms under minor changes to the training process. Previous work suggests that first-order methods need to trade off convergence rate (gradient complexity) for better reproducibility. In this work, we challenge this perception and demonstrate that both optimal reproducibility and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems under various error-prone oracle settings. In particular, given an inexact initialization oracle, our regularization-based algorithms achieve the best of both worlds (optimal reproducibility and near-optimal gradient complexity) for both minimization and minimax optimization. With an inexact gradient oracle, the near-optimal guarantees also hold for minimax optimization. Additionally, with a stochastic gradient oracle, we show that stochastic gradient descent ascent is optimal in terms of both reproducibility and gradient complexity. We believe our results contribute to an enhanced understanding of the reproducibility-convergence trade-off in the context of convex optimization.
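To make the notion of reproducibility under a stochastic gradient oracle concrete, the following is a minimal sketch and not the algorithm or analysis of this work. It runs stochastic gradient descent ascent twice on a toy saddle-point problem and reports how far apart the two outputs land. The objective f(x, y) = 0.5x² + xy − 0.5y², the Gaussian noise model, the step size, the use of the average iterate as output, and the squared Euclidean distance as the deviation measure are all illustrative assumptions; the names `sgda` and `deviation` are hypothetical.

```python
import random


def sgda(seed, sigma=1.0, eta=0.05, steps=2000, z0=(1.0, 1.0)):
    """Run SGDA on the toy objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2.

    The objective, step size, and noise model are illustrative assumptions.
    Returns the average iterate (x_avg, y_avg).
    """
    rng = random.Random(seed)  # the seed models the "minor change" between runs
    x, y = z0
    x_sum, y_sum = 0.0, 0.0
    for _ in range(steps):
        # Stochastic gradient oracle: exact gradient plus Gaussian noise.
        gx = (x + y) + sigma * rng.gauss(0.0, 1.0)  # noisy df/dx
        gy = (x - y) + sigma * rng.gauss(0.0, 1.0)  # noisy df/dy
        # Simultaneous update: descent in x, ascent in y.
        x, y = x - eta * gx, y + eta * gy
        x_sum += x
        y_sum += y
    return x_sum / steps, y_sum / steps


def deviation(z_a, z_b):
    """Squared Euclidean distance between the outputs of two runs (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(z_a, z_b))


if __name__ == "__main__":
    run_a = sgda(seed=0)
    run_b = sgda(seed=1)
    print("deviation between two noisy runs:", deviation(run_a, run_b))
```

In this sketch, two runs that share a seed produce identical outputs, while runs with different seeds differ only through the oracle noise. Varying `eta` and `steps` gives a rough empirical feel for how the deviation and the distance to the saddle point at (0, 0) move together.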