Minimax optimization has seen a surge in interest with the advent of modern applications such as GANs, and it is inherently more challenging than simple minimization. The difficulty is exacerbated when the training data reside on multiple edge devices, or \textit{clients}, especially when these clients have heterogeneous datasets and heterogeneous local computation capabilities. We propose a general federated minimax optimization framework that subsumes such settings and several existing methods such as Local SGDA. We show that naively aggregating heterogeneous local progress amounts to optimizing a mismatched objective function -- a phenomenon previously observed in standard federated minimization. To fix this problem, we propose normalizing each client's update by the number of local steps it performs between successive communication rounds. We analyze the convergence of the proposed algorithm for classes of nonconvex-concave and nonconvex-nonconcave functions, and we characterize the impact of heterogeneous client data, partial client participation, and heterogeneous local computations. Our analysis holds under more general assumptions on the intra-client noise and inter-client heterogeneity than those considered so far in the literature. For all the function classes considered, we significantly improve on the existing computation and communication complexity results. Experimental results support our theoretical claims.
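The normalization idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `local_sgda` and `aggregate`, the equal client weights, and the constant client gradients in the usage note are all assumptions made for illustration. Each client runs a different number of local descent-ascent steps, and the server either averages the raw cumulative updates (naive) or first divides each update by that client's local step count (normalized).

```python
import numpy as np

def local_sgda(x, y, grad_x, grad_y, tau, lr):
    """Run tau local descent-ascent steps and return the cumulative (dx, dy).

    grad_x and grad_y are hypothetical callables returning the client's
    gradients with respect to the min variable x and the max variable y.
    """
    x_loc, y_loc = x.copy(), y.copy()
    for _ in range(tau):
        gx, gy = grad_x(x_loc, y_loc), grad_y(x_loc, y_loc)
        x_loc = x_loc - lr * gx  # descent step on the min variable
        y_loc = y_loc + lr * gy  # ascent step on the max variable
    return x_loc - x, y_loc - y

def aggregate(updates, taus, weights, normalize=True):
    """Weighted average of client updates.

    With normalize=True, each update is divided by the client's number of
    local steps, so clients that run more steps do not dominate the average.
    """
    agg = np.zeros_like(updates[0])
    for delta, tau, w in zip(updates, taus, weights):
        agg += w * (delta / tau if normalize else delta)
    return agg
```

As a usage example under these assumptions, take two equally weighted clients whose gradients in `x` are the constants `+1` and `-1`, running 1 and 3 local steps respectively. The equally weighted average gradient is zero, so the server should not move `x`. Naive aggregation nevertheless moves `x` toward the client that took more steps, while the normalized aggregate is zero.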