Federated Learning (FL) has been gaining popularity as a collaborative learning framework for training deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking: an attacker can control how the object detection system misbehaves by implanting Trojaned gradients through only a small number of compromised clients in the collaborative learning process. This paper introduces STDLens, a principled approach to safeguarding FL against such attacks. We first investigate existing mitigation mechanisms and analyze their failures, which are caused by inherent errors in spatial clustering analysis on gradients. Based on these insights, we introduce a three-tier forensic framework that identifies and expels Trojaned gradients and reclaims performance over the course of FL. We consider three types of adaptive attacks and demonstrate the robustness of STDLens against advanced adversaries. Extensive experiments show that STDLens protects FL against different model hijacking attacks and outperforms existing methods in identifying and removing Trojaned gradients, with significantly higher precision and much lower false-positive rates.