Because federated learning (FL) systems are distributed, detecting and defending against backdoor attacks in them is challenging. In this paper, we observe that the cosine similarity of the last layer's weights between the global model and each local update can serve as an effective indicator of malicious model updates. We therefore propose CosDefense, a cosine-similarity-based attacker detection algorithm. Specifically, under CosDefense, the server calculates the cosine similarity score of the last layer's weights between the global model and each client update, labels as malicious any client whose score is much higher than the average, and filters those clients out of the model aggregation in each round. Compared to existing defense schemes, CosDefense requires no information beyond the received model updates and is compatible with client sampling. Experimental results on three real-world datasets demonstrate that CosDefense provides robust performance under state-of-the-art FL poisoning attacks.
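The per-round filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract only says that clients scoring "much higher than the average" are flagged, so the threshold rule here (mean plus `margin` standard deviations), the unweighted averaging in `aggregate`, and all function and parameter names are hypothetical. Last-layer weights are assumed to be flattened into plain lists of floats.

```python
import math
from statistics import mean, pstdev


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_clients(global_last_layer, client_last_layers, margin=1.0):
    """Score each client against the global model and flag high scorers.

    client_last_layers maps a client id to its flattened last-layer weights.
    ASSUMPTION: "much higher than the average" is modeled as
    score > mean + margin * population standard deviation.
    Returns (kept_ids, flagged_ids, scores).
    """
    scores = {
        cid: cosine_similarity(global_last_layer, weights)
        for cid, weights in client_last_layers.items()
    }
    threshold = mean(scores.values()) + margin * pstdev(scores.values())
    flagged = [cid for cid, score in scores.items() if score > threshold]
    kept = [cid for cid in scores if cid not in flagged]
    return kept, flagged, scores


def aggregate(client_last_layers, kept):
    """ASSUMPTION: plain unweighted element-wise mean over the kept clients."""
    vectors = [client_last_layers[cid] for cid in kept]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    # Toy round: client "d" is aligned with the global last layer far more
    # closely than the other sampled clients.
    global_w = [1.0, 0.0]
    clients = {
        "a": [0.0, 1.0],
        "b": [0.0, 2.0],
        "c": [1.0, 1.0],
        "d": [3.0, 0.0],
    }
    kept, flagged, scores = filter_clients(global_w, clients)
    print("flagged:", flagged)
    print("aggregated:", aggregate(clients, kept))
```

Because the scores are computed only from the updates the server receives in the current round, the same routine applies unchanged when only a sampled subset of clients participates.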