Wireless federated learning (WFL) suffers from heterogeneity in the data distributions, computing power, and channel conditions of participating devices. This paper presents a new Federated Learning with Adjusted leaRning ratE (FLARE) framework to mitigate the impact of this heterogeneity. The key idea is to let each participating device adjust its own learning rate and number of local training iterations according to its instantaneous computing power. A convergence upper bound for FLARE is established rigorously under a general setting with non-convex models, non-i.i.d. datasets, and imbalanced computing power. By minimizing this upper bound, we further optimize the scheduling of FLARE to exploit channel heterogeneity. A nested problem structure is revealed, which allows bandwidth to be allocated by binary search and devices to be selected by a new greedy method in an iterative fashion. A linear problem structure is also identified, and a low-complexity linear programming scheduling policy is designed for the case where the training models have large Lipschitz constants. Experiments demonstrate that FLARE consistently outperforms the baselines in test accuracy and converges much faster with the proposed scheduling policy.
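To make the key idea concrete, the following is a minimal sketch of a per-device schedule in which faster devices run more local iterations with a proportionally smaller learning rate. The abstract does not state FLARE's actual adjustment rule, so everything here is an assumption made for illustration: the function name `adjusted_schedule`, the parameters `speeds`, `deadline`, `base_lr`, and `base_iters`, and in particular the rule that keeps the product of learning rate and local iterations constant across devices.

```python
def adjusted_schedule(speeds, deadline, base_lr, base_iters):
    """Return one (local_iterations, learning_rate) pair per device.

    Hypothetical rule, not taken from the paper: a device runs as many
    local iterations as fit in the round deadline, and its learning rate
    is scaled so that learning_rate * local_iterations is the same for
    every device.
    """
    schedule = []
    for speed in speeds:  # speed = local iterations per second (assumed unit)
        iters = max(1, int(speed * deadline))  # at least one local step
        lr = base_lr * base_iters / iters      # smaller steps for more iterations
        schedule.append((iters, lr))
    return schedule


if __name__ == "__main__":
    # Three devices with different instantaneous computing power.
    for iters, lr in adjusted_schedule([2.0, 5.0, 10.0], 1.0, 0.1, 5):
        print(f"local iterations = {iters}, learning rate = {lr:.3f}")
```

Under this assumed rule, a slow device takes a few large steps and a fast device takes many small ones, so no device is forced to wait for or match the others within a round.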