Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL). Previous convergence analyses of FedAvg assume either full client participation or partial participation in which clients are sampled uniformly. In practical cross-device FL systems, however, only the subset of clients that satisfy local criteria such as battery status, network connectivity, and maximum participation frequency requirements (imposed to ensure privacy) is available for training at any given time. As a result, client availability follows a natural cyclic pattern. We provide (to our knowledge) the first theoretical framework for analyzing the convergence of FedAvg under cyclic client participation with several client optimizers, namely GD, SGD, and shuffled SGD. Our analysis shows that, under suitable conditions, cyclic client participation can achieve a faster asymptotic convergence rate than vanilla FedAvg with uniform client participation, which offers insight into the design of client sampling protocols.
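To make the participation pattern concrete, the following is a minimal sketch of FedAvg with cyclic client participation on a toy problem. It is an illustration under stated assumptions, not the algorithm or analysis setting of the paper: each client i is assumed to hold the scalar objective 0.5 * (w - c_i)^2, clients are split into contiguous equal-sized groups, one group is available per round in a fixed cyclic order, and the client optimizer is plain GD. The function name `cyclic_fedavg` and all of its parameters are hypothetical.

```python
import random


def cyclic_fedavg(centers, num_groups, clients_per_round, rounds,
                  local_steps=5, lr=0.1, seed=0):
    """Toy FedAvg in which client groups become available in cyclic order.

    Assumption: client i minimizes 0.5 * (w - centers[i]) ** 2, so its
    gradient at w is simply (w - centers[i]).
    """
    rng = random.Random(seed)

    # Partition client indices into contiguous, equally sized groups.
    # Any clients left over after integer division are ignored in this sketch.
    size = len(centers) // num_groups
    groups = [list(range(g * size, (g + 1) * size)) for g in range(num_groups)]

    w = 0.0  # global model, initialized at zero
    for t in range(rounds):
        # Only one group is available in round t; groups are visited cyclically.
        available = groups[t % num_groups]
        # Sample participating clients uniformly without replacement
        # from the currently available group.
        selected = rng.sample(available, clients_per_round)

        local_models = []
        for i in selected:
            w_i = w
            for _ in range(local_steps):
                w_i -= lr * (w_i - centers[i])  # one local GD step
            local_models.append(w_i)

        # Server step: average the returned local models.
        w = sum(local_models) / len(local_models)
    return w


if __name__ == "__main__":
    # Two groups of four clients each, with different local minimizers.
    centers = [1.0] * 4 + [3.0] * 4
    w = cyclic_fedavg(centers, num_groups=2, clients_per_round=2,
                      rounds=200, local_steps=1, lr=0.05)
    print(w)
```

In this toy run the global minimizer of the average objective is 2.0. The final iterate lands close to it, with a small offset toward the group visited most recently, which illustrates how the cyclic availability order shapes the trajectory of the global model.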