Federated Learning (FL) enables multiple users to collaboratively train a global model in a distributed manner without revealing their personal data. However, FL remains vulnerable to model poisoning attacks, in which malicious clients inject crafted updates to degrade the global model's accuracy. These vulnerabilities are particularly severe in non-homogeneous settings, where clients hold differing proportions of class labels and therefore submit heterogeneous updates. In such settings, benign outliers are often flagged as malicious (false positives), while carefully crafted malicious updates evade detection and are aggregated at the server. Existing defense mechanisms struggle in these realistic conditions, leading to significant declines in the global FL model's performance. We propose a novel defense mechanism, Kernel-based Trust Segmentation (KeTS), to counter model poisoning attacks. Unlike existing approaches, KeTS analyzes the evolution of each client's updates and uses Kernel Density Estimation (KDE) to effectively segment malicious clients from benign ones, even in the presence of benign outliers. We thoroughly evaluate KeTS against the six most effective model poisoning attacks (i.e., the Trim attack, Krum attack, Min-Max attack, Min-Sum attack, and their variants) on two different datasets (i.e., MNIST and Fashion-MNIST), and compare its performance with three classical robust aggregation schemes (i.e., Krum, Trim-Mean, and Median) and a state-of-the-art defense (i.e., FLTrust). Our results show that KeTS outperforms the existing defenses in every attack setting, beating the best-performing defense by an overall average of >24% (on MNIST) and >14% (on Fashion-MNIST). A series of further experiments (varying the poisoning approach, attacker population, etc.) reveal the consistent and superior performance of KeTS under diverse conditions.
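The core idea sketched in the abstract, scoring each client from the evolution of its updates and then separating low-trust clients with a one-dimensional KDE, can be illustrated as follows. This is a minimal sketch under stated assumptions: the per-client scalar trust score, the Gaussian kernel, the fixed bandwidth, and the valley-splitting rule are all illustrative choices for exposition, not the paper's exact KeTS procedure.

```python
import numpy as np

def gaussian_kde_1d(scores, grid, bandwidth):
    """Evaluate a 1-D Gaussian KDE of `scores` on the points in `grid`."""
    diffs = (grid[:, None] - scores[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2)
    return kernel.sum(axis=1) / (len(scores) * bandwidth * np.sqrt(2.0 * np.pi))

def segment_clients(scores, bandwidth=0.05):
    """Split clients into trusted/untrusted groups at the deepest
    valley of the estimated score density.

    `scores` is assumed to be one scalar per client summarizing how its
    updates have evolved (e.g., similarity between successive updates);
    the exact scoring rule is an assumption, not the paper's definition.
    Returns a boolean mask: True = trusted (high-score) segment.
    """
    scores = np.asarray(scores, dtype=float)
    grid = np.linspace(scores.min(), scores.max(), 512)
    density = gaussian_kde_1d(scores, grid, bandwidth)
    # Interior local minima of the density curve are candidate cut points.
    interior = (density[1:-1] < density[:-2]) & (density[1:-1] < density[2:])
    minima = np.where(interior)[0] + 1
    if len(minima) == 0:
        # Unimodal density: no separating valley, so trust every client.
        return np.ones(len(scores), dtype=bool)
    cut = grid[minima[np.argmin(density[minima])]]
    return scores > cut
```

With well-separated scores, e.g. four benign clients near 0.9 and two attackers near 0.1, the density is bimodal and the valley between the modes cleanly segments the two attackers into the untrusted group; with a single tight cluster, no valley exists and all clients are kept.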