We propose FedGT, a novel framework for identifying malicious clients in federated learning with secure aggregation. Inspired by group testing, the framework leverages overlapping groups of clients to identify the presence of malicious clients in the groups via a decoding operation. The clients identified as malicious are then removed, and the model is trained over the remaining clients. By choosing the size, number, and overlap of the groups, FedGT strikes a balance between privacy and security. Specifically, the server learns the aggregated model of the clients in each group; vanilla federated learning and secure aggregation correspond to the extreme cases of FedGT with group size equal to one and to the total number of clients, respectively. The effectiveness of FedGT is demonstrated through extensive experiments on the MNIST, CIFAR-10, and ISIC2019 datasets in a cross-silo setting under different data-poisoning attacks. These experiments showcase FedGT's ability to identify malicious clients, resulting in high model utility. We further show that FedGT significantly outperforms the private robust aggregation approach based on the geometric median recently proposed by Pillutla et al., both on heterogeneous client data (ISIC2019) and in the presence of targeted attacks (CIFAR-10 and ISIC2019).
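To make the group-testing idea concrete, the following is a minimal sketch of overlapping groups and a decoding step. It is not FedGT's actual decoder: it assumes a noiseless test that flags a group exactly when it contains a malicious client, and it uses the simple COMP rule from the group-testing literature (any client appearing in a negative group is cleared). The group layout `GROUPS` and the function names are hypothetical, chosen only for illustration.

```python
from typing import List, Set

# Hypothetical assignment of 6 clients to 4 overlapping groups.
# Each inner list holds the client indices in one group; every client
# appears in exactly two groups.
GROUPS: List[List[int]] = [
    [0, 1, 2],
    [2, 3, 4],
    [0, 4, 5],
    [1, 3, 5],
]


def run_group_tests(groups: List[List[int]], malicious: Set[int]) -> List[bool]:
    """Assumed noiseless test: a group is positive iff it contains a malicious client.

    In practice the server would test the aggregated model of each group,
    and that test could be noisy; this oracle is a simplification.
    """
    return [any(client in malicious for client in group) for group in groups]


def comp_decode(groups: List[List[int]], results: List[bool], n_clients: int) -> List[int]:
    """COMP decoding: clear every client that belongs to a negative group.

    Returns the indices of clients that could not be cleared.
    """
    cleared: Set[int] = set()
    for group, positive in zip(groups, results):
        if not positive:
            cleared.update(group)
    return [client for client in range(n_clients) if client not in cleared]


results = run_group_tests(GROUPS, malicious={3})
flagged = comp_decode(GROUPS, results, n_clients=6)
print(results)  # [False, True, False, True]
print(flagged)  # [3]
```

With noiseless tests, this rule never clears a malicious client, but it can flag benign ones when several malicious clients make every group test positive. The group size and overlap therefore control both how much the server learns about individual clients and how reliably malicious clients can be isolated.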