Federated learning (FL) enables multiple clients to collaboratively train models without sharing their local data, and has become an important privacy-preserving machine learning framework. However, classical FL faces serious security and robustness problems: for example, malicious clients can poison their model updates and, at the same time, claim large data quantities to amplify the impact of those updates in model aggregation. Existing defense methods for FL all handle malicious model updates, but they either treat all reported quantities as benign or simply ignore or truncate the quantities of all clients. The former is vulnerable to quantity-enhanced attacks, while the latter leads to sub-optimal performance, since the local datasets on different clients usually differ significantly in size. In this paper, we propose a robust quantity-aware aggregation algorithm for federated learning, called FedRA, which performs aggregation with awareness of local data quantities while defending against quantity-enhanced attacks. More specifically, we propose a method that filters malicious clients by jointly considering the model updates and data quantities uploaded by different clients, and then performs quantity-aware weighted averaging on the model updates from the remaining clients. Moreover, since the number of malicious clients participating in federated learning may change dynamically across rounds, we also propose a malicious client number estimator to predict how many suspicious clients should be filtered in each round. Experiments on four public datasets demonstrate the effectiveness of FedRA in defending FL against quantity-enhanced attacks.
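To make the quantity-enhanced attack concrete, the following minimal sketch shows plain quantity-weighted averaging over toy two-dimensional updates. It is not the FedRA algorithm, whose filtering rule and estimator are not specified in this abstract. The function name `weighted_average` and all numbers are hypothetical and chosen only for illustration.

```python
def weighted_average(updates, quantities):
    """Quantity-weighted average of client updates (plain FedAvg-style aggregation)."""
    total = sum(quantities)
    dim = len(updates[0])
    return [
        sum(q * u[i] for u, q in zip(updates, quantities)) / total
        for i in range(dim)
    ]


# Toy setup (hypothetical numbers): three benign clients and one malicious client.
benign_updates = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
benign_quantities = [100, 200, 300]
poisoned_update = [-1.0, -1.0]
updates = benign_updates + [poisoned_update]

# The malicious client reports a plausible quantity.
honest_claim = weighted_average(updates, benign_quantities + [200])

# Same poisoned update, but the client claims an inflated quantity.
inflated_claim = weighted_average(updates, benign_quantities + [6000])

# Ignoring quantities (uniform weights) caps the attacker's weight, but also
# discards the real size differences between the benign clients.
uniform = weighted_average(updates, [1, 1, 1, 1])

print(honest_claim, inflated_claim, uniform)
```

With the plausible claim, the aggregate still points in the benign direction. With the inflated claim, the poisoned update dominates and the aggregate flips sign. This is the amplification effect that a quantity-aware defense has to handle without falling back to uniform weights.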