Nonconvex-nonconcave minimax optimization has gained widespread interest over the last decade. However, most existing work focuses on variants of gradient descent-ascent (GDA) algorithms, which are only applicable in smooth nonconvex-concave settings. To address this limitation, we propose a novel algorithm named smoothed proximal linear descent-ascent (smoothed PLDA), which can effectively handle a broad range of structured nonsmooth nonconvex-nonconcave minimax problems. Specifically, we consider the setting where the primal function has a nonsmooth composite structure and the dual function possesses the Kurdyka-\L{}ojasiewicz (K\L{}) property with exponent $\theta \in [0,1)$. We introduce a novel convergence analysis framework for smoothed PLDA, the key components of which are our newly developed nonsmooth primal error bound and dual error bound properties. Using this framework, we show that smoothed PLDA can find both $\epsilon$-game-stationary points and $\epsilon$-optimization-stationary points of the problems of interest in $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$ iterations. Furthermore, when $\theta \in [0,1/2]$, smoothed PLDA achieves the optimal iteration complexity of $\mathcal{O}(\epsilon^{-2})$. To further demonstrate the effectiveness and wide applicability of our analysis framework, we show that certain max-structure problems possess the K\L{} property with exponent $\theta=0$ under mild assumptions. As a by-product, we establish algorithm-independent quantitative relationships among various stationarity concepts, which may be of independent interest.
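To make the stated iteration bound concrete, the exponent can be split by cases on the K\L{} exponent $\theta$. This is only arithmetic on the bound quoted above, not an additional result:

\[
\mathcal{O}\bigl(\epsilon^{-2\max\{2\theta,1\}}\bigr) =
\begin{cases}
\mathcal{O}(\epsilon^{-2}), & \theta \in [0,1/2],\\
\mathcal{O}(\epsilon^{-4\theta}), & \theta \in (1/2,1).
\end{cases}
\]

For instance, $\theta = 3/4$ gives $\mathcal{O}(\epsilon^{-3})$, and the bound approaches $\mathcal{O}(\epsilon^{-4})$ as $\theta \to 1$.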