Federated Learning (FL) has been gaining popularity as a collaborative learning framework for training deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking: by implanting Trojaned gradients through only a small number of compromised clients in the collaborative learning process, an attacker can control how the object detection system misbehaves. This paper introduces STDLens, a principled approach to safeguarding FL against such attacks. We first investigate existing mitigation mechanisms and analyze their failures, which are caused by the inherent errors in spatial clustering analysis on gradients. Based on these insights, we introduce a three-tier forensic framework to identify and expel Trojaned gradients and reclaim the performance over the course of FL. We consider three types of adaptive attacks and demonstrate the robustness of STDLens against advanced adversaries. Extensive experiments show that STDLens can protect FL against different model hijacking attacks and that it outperforms existing methods in identifying and removing Trojaned gradients, with significantly higher precision and much lower false-positive rates.