Because federated learning (FL) is distributed by nature, detecting and defending against backdoor attacks in FL systems is challenging. In this paper, we observe that the cosine similarity between the last-layer weights of the global model and those of each local update can serve as an effective indicator of malicious model updates. We therefore propose CosDefense, a cosine-similarity-based attacker detection algorithm. Specifically, under CosDefense, the server computes the cosine similarity score between the last-layer weights of the global model and those of each client update, labels as malicious the clients whose scores are much higher than the average, and filters them out of the model aggregation in each round. Compared with existing defense schemes, CosDefense requires no information beyond the received model updates and is compatible with client sampling. Experimental results on three real-world datasets demonstrate that CosDefense provides robust performance under the state-of-the-art FL poisoning attack.
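The server-side filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine_similarity`, `cos_defense_filter`, `aggregate`), the flattened NumPy representation of the last-layer weights, and in particular the threshold rule (mean plus `margin` standard deviations as a stand-in for "much higher than the average") are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two weight tensors, flattened to vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def cos_defense_filter(global_last, client_lasts, margin=1.0):
    """Score each client against the global model and flag high outliers.

    global_last  : last-layer weights of the global model
    client_lasts : list of last-layer weights, one per client update
    margin       : ASSUMPTION, number of standard deviations above the mean
                   at which a score counts as "much higher than the average"
    Returns (scores, keep), where keep[i] is False for flagged clients.
    """
    scores = np.array([cosine_similarity(global_last, w) for w in client_lasts])
    threshold = scores.mean() + margin * scores.std()
    keep = scores <= threshold
    return scores, keep


def aggregate(client_updates, keep):
    """Plain average over the updates that were not flagged as malicious."""
    kept = [u for u, k in zip(client_updates, keep) if k]
    return np.mean(kept, axis=0)
```

Only the received updates are used here, and the scores are computed over whichever clients report in a given round, which is consistent with the claim that the method needs no extra information and works with client sampling. The actual thresholding rule should be taken from the paper body rather than from this sketch.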