Federated learning (FL), while a promising approach to collaborative model training, is susceptible to backdoor attacks because of its decentralized nature. Backdoor attacks are remarkably stealthy, as they compromise model predictions only when inputs contain specific triggers. As a countermeasure, anomaly detection is widely used to filter out backdoor attacks in FL. However, the non-independent and identically distributed (non-IID) data of FL clients poses substantial challenges for backdoor attack detection: data variety introduces variance among benign models, making them hard to distinguish from malicious ones. In this work, we propose BoBa, a novel distribution-aware backdoor detection mechanism, to address this problem. To differentiate outliers arising from data variety from those arising from backdoor attacks, we break the problem into two steps: clustering clients by their data distributions, followed by voting-based detection. We propose a novel data distribution inference mechanism for accurate data distribution estimation. To improve detection robustness, we introduce an overlapping clustering method in which each client is associated with multiple clusters, so that the trustworthiness of a model update is assessed collectively by multiple clusters rather than by a single one. Through extensive evaluations, we demonstrate that BoBa reduces the attack success rate to below 0.001 while maintaining high main task accuracy across various attack strategies and experimental settings.
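The two-step idea described in the abstract (overlapping clustering on client data distributions, then voting-based detection) can be illustrated with a small sketch. This is not BoBa's actual algorithm, which the abstract does not specify. Everything here is an assumption made for illustration: the function names `overlapping_clusters` and `vote_flags`, the nearest-neighbour cluster construction, the median/MAD outlier rule, the majority-vote threshold, and the premise that per-client label distribution estimates are already available.

```python
import math
from statistics import median

def overlapping_clusters(label_dists, k):
    """Hypothetical clustering: one cluster per client, holding the k + 1
    clients closest to it in distribution space (normally itself included).
    A client therefore belongs to several clusters."""
    n = len(label_dists)
    clusters = []
    for p in label_dists:
        # sorted() is stable, so equal distances are ordered by client index
        order = sorted(range(n), key=lambda j: math.dist(p, label_dists[j]))
        clusters.append(order[: k + 1])
    return clusters

def vote_flags(updates, clusters, z=3.0):
    """Hypothetical voting: each cluster votes a member suspicious when its
    update lies far from the cluster's coordinate-wise median update.
    A client is flagged when more than half of its clusters vote against it."""
    n = len(updates)
    suspicious = [0] * n
    total = [0] * n
    for members in clusters:
        # coordinate-wise median of the members' updates
        center = [median(col) for col in zip(*(updates[i] for i in members))]
        dev = [math.dist(updates[i], center) for i in members]
        med = median(dev)
        # median absolute deviation; the small constant avoids a zero threshold gap
        mad = median(abs(d - med) for d in dev) + 1e-12
        for i, d in zip(members, dev):
            total[i] += 1
            suspicious[i] += d > med + z * mad
    return [total[i] > 0 and suspicious[i] > 0.5 * total[i] for i in range(n)]
```

Because every client is judged by several clusters, one poorly formed cluster cannot decide on its own whether an update is accepted, which is the robustness argument the abstract makes for overlapping clustering. The threshold `z` and the neighbourhood size `k` are illustrative knobs, not values taken from the paper.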