Minimax problems have achieved success in many machine learning applications, such as adversarial training, robust optimization, and reinforcement learning. On the theoretical side, the current optimal excess risk bounds, which are composed of a generalization error and an optimization error, attain 1/n-rates in strongly-convex-strongly-concave (SC-SC) settings. Existing studies mainly focus on the optimization error of specific algorithms for minimax problems, with only a few studies on generalization performance, which limits the possibility of obtaining better excess risk bounds. In this paper, we study generalization bounds measured by the gradients of the primal function, using uniform localized convergence. We obtain a sharper high-probability generalization error bound for nonconvex-strongly-concave (NC-SC) stochastic minimax problems. Furthermore, we provide dimension-independent results under a Polyak-Lojasiewicz condition on the outer layer. Based on our generalization error bound, we analyze several popular algorithms, including empirical saddle point (ESP), gradient descent ascent (GDA), and stochastic gradient descent ascent (SGDA). Under further reasonable assumptions, we derive better excess primal risk bounds which, to the best of our knowledge, are n times faster than existing results for minimax problems.
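To make the quantities in the abstract concrete, the following sketch spells out the standard setup and the usual excess-primal-risk decomposition it refers to; the notation ($f$, $F$, $F_S$, $P$, $P_S$, $x_S$) is generic and need not match the paper's own symbols.
\[
\min_{x\in\mathcal{X}}\max_{y\in\mathcal{Y}} F(x,y) := \mathbb{E}_{z\sim\mathcal{D}}[f(x,y;z)],
\qquad
F_S(x,y) := \frac{1}{n}\sum_{i=1}^{n} f(x,y;z_i),
\]
where $S=\{z_1,\dots,z_n\}$ is the training sample of size $n$. The population and empirical primal (outer-layer) functions are
\[
P(x) := \max_{y\in\mathcal{Y}} F(x,y), \qquad P_S(x) := \max_{y\in\mathcal{Y}} F_S(x,y),
\]
and for an algorithm output $x_S$ the excess primal risk decomposes exactly as
\[
P(x_S)-\min_{x}P(x)
= \underbrace{P(x_S)-P_S(x_S)}_{\text{generalization error}}
+ \underbrace{P_S(x_S)-\min_{x}P_S(x)}_{\text{optimization error}}
+ \Bigl(\min_{x}P_S(x)-\min_{x}P(x)\Bigr).
\]
In the nonconvex-strongly-concave setting, where global minimizers of $P$ are out of reach, generalization bounds "measured by the gradients of primal functions" are typically stated for a quantity such as $\|\nabla P(x_S)-\nabla P_S(x_S)\|$.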