This paper focuses on stochastic methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their potential applications in deep learning (e.g., deep AUC maximization, distributionally robust optimization). However, most existing algorithms are slow in practice, and their analysis revolves around convergence to a nearly stationary point. We consider leveraging the Polyak-Lojasiewicz (PL) condition to design faster stochastic algorithms with stronger convergence guarantees. Although the PL condition has been utilized for designing many stochastic minimization algorithms, its applications to non-convex min-max optimization remain rare. In this paper, we propose and analyze a generic framework of proximal stage-based methods in which many well-known stochastic updates can be embedded. Fast convergence is established in terms of both the primal objective gap and the duality gap. Compared with existing studies, (i) our analysis is based on a novel Lyapunov function consisting of the primal objective gap and the duality gap of a regularized function, and (ii) the results are more comprehensive, with improved rates that have better dependence on the condition number under different assumptions. We also conduct deep and non-deep learning experiments to verify the effectiveness of our methods.
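For reference, the quantities named above can be written in a standard form. The notation ($f$, $P$, $\mu$, and the primed variables) is assumed here for illustration and is not taken from the abstract; the paper's own definitions, and the exact form of its PL assumption, may differ.

\[
\min_{x} \max_{y} f(x, y), \qquad P(x) = \max_{y} f(x, y),
\]
\[
\tfrac{1}{2}\,\|\nabla P(x)\|^2 \;\ge\; \mu \Bigl(P(x) - \min_{x'} P(x')\Bigr) \quad \text{(PL condition on the primal objective, } \mu > 0\text{)},
\]
\[
\text{primal objective gap: } P(x) - \min_{x'} P(x'), \qquad \text{duality gap: } \max_{y'} f(x, y') - \min_{x'} f(x', y).
\]

Under a condition of this kind, every stationary point of $P$ is a global minimizer, which is why guarantees on the objective gap are stronger than guarantees of near-stationarity.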