This paper studies federated learning (FL) in the presence of malicious Byzantine attacks and data heterogeneity. We propose a novel Robust Average Gradient Algorithm (RAGA), which uses the geometric median for aggregation and allows the number of local update rounds to be chosen freely. Unlike most existing resilient approaches, whose convergence analysis assumes a strongly convex loss function or a homogeneously distributed dataset, we analyze convergence for both strongly convex and non-convex loss functions over heterogeneous datasets. Our theoretical analysis shows that, as long as the fraction of the dataset contributed by malicious users is less than half, RAGA converges at rate $\mathcal{O}(1/T^{2/3-\delta})$ for non-convex loss functions, where $T$ is the number of iterations and $\delta \in (0, 2/3)$, and at a linear rate for strongly convex loss functions. Moreover, a stationary point or the global optimal solution is proved to be obtainable as data heterogeneity vanishes. Experimental results corroborate the robustness of RAGA to Byzantine attacks and verify its advantage over baselines in convergence performance under various intensities of Byzantine attacks on heterogeneous datasets.
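To make the aggregation step concrete, the following is a minimal sketch of geometric-median aggregation using the classical Weiszfeld iteration. It is an illustration under stated assumptions, not the RAGA procedure itself: the function name `geometric_median`, its parameters, the optional per-user `weights` (for example, proportional to local dataset size), and the toy gradients are all hypothetical and do not come from the paper.

```python
import numpy as np

def geometric_median(points, weights=None, max_iter=200, tol=1e-8, eps=1e-12):
    """Approximate the (weighted) geometric median with Weiszfeld's iteration.

    points  : array of shape (n_users, dim), one uploaded vector per user
    weights : optional non-negative weights, one per user (assumption:
              e.g. proportional to local dataset size); uniform if None
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    weights = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    weights = weights / weights.sum()

    # Start from the weighted mean, then refine.
    z = np.average(points, axis=0, weights=weights)
    for _ in range(max_iter):
        dist = np.linalg.norm(points - z, axis=1)
        # Clamp distances to avoid division by zero when z hits a data point.
        inv = weights / np.maximum(dist, eps)
        z_new = (inv[:, None] * points).sum(axis=0) / inv.sum()
        if np.linalg.norm(z_new - z) <= tol:
            return z_new
        z = z_new
    return z

# Toy example (hypothetical data): four honest users, three Byzantine users.
honest = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.0, 1.2]])
byzantine = np.array([[100.0, -100.0]] * 3)
grads = np.vstack([honest, byzantine])

agg = geometric_median(grads)   # robust aggregate
mean = grads.mean(axis=0)       # plain averaging, for comparison
print("geometric median:", agg)
print("plain mean:      ", mean)
```

In this toy setting the Byzantine vectors make up less than half of the inputs, so the geometric median stays near the honest cluster while the plain mean is dragged far away. This is the intuition behind the "less than half" condition in the abstract, although the paper states it in terms of the fraction of the dataset rather than the number of users.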