Federated Learning (FL) enables distributed model training on edge devices while preserving data privacy. However, client data is typically non-Independent and Identically Distributed (non-IID), which often leads to client-drift, slowing convergence and degrading model performance. While adaptive optimizers have been proposed to mitigate these effects, they frequently introduce computational complexity or communication overhead unsuitable for resource-constrained IoT environments. This paper introduces Federated Zero Mean Gradients (FedZMG), a novel, parameter-free, client-side optimization algorithm that tackles client-drift by structurally regularizing the optimization space. Building on the idea of Gradient Centralization, FedZMG projects local gradients onto a zero-mean hyperplane, effectively neutralizing the "intensity" or "bias" shifts inherent in heterogeneous data distributions without requiring additional communication or hyperparameter tuning. A theoretical analysis proves that FedZMG reduces the effective gradient variance and guarantees tighter convergence bounds than standard FedAvg. Extensive empirical evaluations on the EMNIST, CIFAR-100, and Shakespeare datasets demonstrate that FedZMG achieves faster convergence and higher final validation accuracy than both the FedAvg baseline and the adaptive optimizer FedAdam, particularly in highly non-IID settings.
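To make the core operation concrete, the following is a minimal PyTorch sketch (not the authors' implementation) of a zero-mean gradient projection in the spirit of Gradient Centralization: each weight gradient is centered so its mean is zero, i.e., Φ(g) = g − (1ᵀg/d)·1. The helper name `zero_mean_project`, the per-output-channel centering convention, and the toy model are illustrative assumptions; the exact projection axes used by FedZMG may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def zero_mean_project(grad: torch.Tensor) -> torch.Tensor:
    # Center multi-dimensional (weight) gradients so each output-channel
    # slice has zero mean; 1-D parameters such as biases are left unchanged.
    if grad.dim() > 1:
        mean = grad.mean(dim=tuple(range(1, grad.dim())), keepdim=True)
        return grad - mean
    return grad

# Hypothetical client-side local step, for illustration only.
model = nn.Linear(10, 5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 10), torch.randint(0, 5, (8,))
loss = F.cross_entropy(model(x), y)
loss.backward()
for p in model.parameters():
    if p.grad is not None:
        # Parameter-free projection: no extra state, tuning, or communication.
        p.grad.data = zero_mean_project(p.grad.data)
optimizer.step()
```

Because the projection is applied locally and element-wise to gradients already computed, it adds negligible compute and leaves the communication protocol of FedAvg untouched.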