Federated Learning (FL) has been gaining popularity as a collaborative learning framework for training deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking. By implanting Trojaned gradients through only a small number of compromised clients in the collaborative learning process, an attacker can control how the object detection system misbehaves. This paper introduces STDLens, a principled approach to safeguarding FL against such attacks. We first investigate existing mitigation mechanisms and analyze their failures, which are caused by the inherent errors in spatial clustering analysis on gradients. Based on these insights, we introduce a three-tier forensic framework that identifies and expels Trojaned gradients and recovers performance over the course of FL. We consider three types of adaptive attacks and demonstrate the robustness of STDLens against advanced adversaries. Extensive experiments show that STDLens can protect FL against different model hijacking attacks and outperforms existing methods in identifying and removing Trojaned gradients, with significantly higher precision and much lower false-positive rates.