Minimax optimization has seen a surge in interest with the advent of modern applications such as GANs, and it is inherently more challenging than simple minimization. The difficulty is exacerbated by the training data residing at multiple edge devices or \textit{clients}, especially when these clients can have heterogeneous datasets and local computation capabilities. We propose a general federated minimax optimization framework that subsumes such settings and several existing methods like Local SGDA. We show that naive aggregation of heterogeneous local progress results in optimizing a mismatched objective function -- a phenomenon previously observed in standard federated minimization. To fix this problem, we propose normalizing the client updates by the number of local steps undertaken between successive communication rounds. We analyze the convergence of the proposed algorithm for classes of nonconvex-concave and nonconvex-nonconcave functions and characterize the impact of heterogeneous client data, partial client participation, and heterogeneous local computations. Our analysis works under more general assumptions on the intra-client noise and inter-client heterogeneity than so far considered in the literature. For all the function classes considered, we significantly improve the existing computation and communication complexity results. Experimental results support our theoretical claims.
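The normalization idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm verbatim: the function names (`local_sgda`, `aggregate`), the toy quadratic client objectives, and the rescaling by an effective step count `tau_eff` (borrowed from normalized averaging in federated minimization) are all assumptions made for illustration. Each client runs a different number of local gradient descent-ascent steps; the server either averages the raw cumulative updates (naive) or first divides each update by that client's step count (normalized).

```python
def local_sgda(x, y, grad_x, grad_y, steps, lr):
    """Run `steps` local descent-ascent steps; return the cumulative (dx, dy)."""
    x0, y0 = x, y
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)  # gradients at the current iterate
        x, y = x - lr * gx, y + lr * gy      # descend in x, ascend in y
    return (x - x0, y - y0)


def aggregate(deltas, local_steps, weights, normalize=True):
    """Combine client updates into one server update (dx, dy).

    normalize=False: weighted average of raw cumulative updates.
    normalize=True:  divide each update by its client's step count, average,
                     then rescale by the weighted mean step count (assumption).
    """
    if not normalize:
        return tuple(sum(w * d[k] for w, d in zip(weights, deltas)) for k in range(2))
    tau_eff = sum(w * t for w, t in zip(weights, local_steps))
    return tuple(
        tau_eff * sum(w * d[k] / t for w, d, t in zip(weights, deltas, local_steps))
        for k in range(2)
    )


# Hypothetical toy problem: f_i(x, y) = 0.5*(x - a_i)^2 - 0.5*(y - b_i)^2.
# With equal weights, the global saddle point is (mean(a_i), mean(b_i)) = (0, 0).
centers = [(1.0, 1.0), (-1.0, -1.0)]
local_steps = [1, 3]          # client 2 does three times as much local work
weights = [0.5, 0.5]

deltas = []
for (a, b), steps in zip(centers, local_steps):
    deltas.append(
        local_sgda(
            0.0, 0.0,
            grad_x=lambda x, y, a=a: x - a,      # d f_i / d x
            grad_y=lambda x, y, b=b: -(y - b),   # d f_i / d y
            steps=steps, lr=0.1,
        )
    )

naive = aggregate(deltas, local_steps, weights, normalize=False)
normalized = aggregate(deltas, local_steps, weights, normalize=True)
print("naive update:     ", naive)
print("normalized update:", normalized)
```

Starting from the global saddle point, the naive update is pulled toward the client that took more local steps, while the normalized update stays much closer to zero. This is the objective mismatch that normalization is meant to remove, shown here only on a two-client toy problem.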