This paper takes an initial step toward systematically investigating the generalization bounds of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization, where generalization is measured by the stationarity of the primal function. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $\epsilon$-generalization are $\tilde{\mathcal{O}}(d\kappa^2\epsilon^{-2})$ and $\tilde{\mathcal{O}}(d\epsilon^{-4})$ for the NC-SC and NC-C settings, respectively, where $d$ is the dimension and $\kappa$ is the condition number. We further study algorithm-dependent generalization bounds via algorithmic stability arguments. In particular, we introduce a novel stability notion for minimax problems and connect it to generalization bounds. As a result, we establish algorithm-dependent generalization bounds for the stochastic gradient descent ascent (SGDA) algorithm and for the more general class of sampling-determined algorithms.
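To make the quantities in the abstract concrete, the following is a minimal sketch of simultaneous SGDA on a toy NC-SC problem. Everything in it is an illustrative assumption rather than the paper's setup: the objective $f(x, y; \xi) = \sum_j \big[y_j(\cos x_j + \xi_j) - \tfrac{\mu}{2} y_j^2\big]$, the Gaussian noise model, the step sizes, and the names `sgda` and `primal_grad_norm` are all hypothetical. The objective is chosen because the maximizer $y^*(x)$ has a closed form, so the gradient norm of the primal function $\Phi(x) = \max_y F(x, y)$ can be evaluated exactly for both the empirical and the population problem. The reported gap, $\big|\,\|\nabla\Phi(x)\| - \|\nabla\Phi_S(x)\|\,\big|$, is one plausible reading of "generalization measured by the stationarity of primal functions".

```python
import math
import random

MU = 1.0  # strong-concavity modulus in y (toy value, an assumption)


def sgda(samples, x0, y0, eta_x=0.01, eta_y=0.05, steps=5000, seed=0):
    """Simultaneous SGDA on the toy objective
    f(x, y; xi) = sum_j [ y_j * (cos(x_j) + xi_j) - MU/2 * y_j**2 ]."""
    rng = random.Random(seed)
    x, y = list(x0), list(y0)
    for _ in range(steps):
        xi = samples[rng.randrange(len(samples))]  # draw one training sample
        # Stochastic gradients, both evaluated at the current (x, y).
        gx = [-math.sin(xj) * yj for xj, yj in zip(x, y)]
        gy = [math.cos(xj) + xij - MU * yj for xj, yj, xij in zip(x, y, xi)]
        x = [xj - eta_x * g for xj, g in zip(x, gx)]  # descent step in x
        y = [yj + eta_y * g for yj, g in zip(y, gy)]  # ascent step in y
    return x, y


def primal_grad_norm(x, xi_mean):
    """Gradient norm of Phi(x) = max_y F(x, y) for the averaged objective.
    Uses the closed form y*(x) = (cos(x) + xi_mean) / MU and Danskin's theorem."""
    g = [-math.sin(xj) * (math.cos(xj) + m) / MU for xj, m in zip(x, xi_mean)]
    return math.sqrt(sum(v * v for v in g))


data_rng = random.Random(1)
d, n, sigma = 3, 200, 0.1
# Training set S: n i.i.d. samples xi ~ N(0, sigma^2 I), so E[xi] = 0.
samples = [[data_rng.gauss(0.0, sigma) for _ in range(d)] for _ in range(n)]
xi_bar = [sum(s[j] for s in samples) / n for j in range(d)]
x0 = [data_rng.uniform(0.5, 1.0) for _ in range(d)]
y0 = [0.0] * d

x_out, y_out = sgda(samples, x0, y0)
emp_init = primal_grad_norm(x0, xi_bar)          # empirical stationarity at init
emp_final = primal_grad_norm(x_out, xi_bar)      # empirical stationarity at output
pop_final = primal_grad_norm(x_out, [0.0] * d)   # population stationarity at output
gap = abs(pop_final - emp_final)
print(f"empirical: {emp_init:.4f} -> {emp_final:.4f}, "
      f"population: {pop_final:.4f}, gap: {gap:.4f}")
```

In this toy problem the gap depends on the data only through the sample mean of $\xi$, so it should shrink as $n$ grows. That behavior is specific to this sketch and says nothing about the rates stated in the abstract.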