As a distributed machine learning paradigm, Federated Learning (FL) enables a large number of clients to collaboratively train a model without sharing their raw data. However, because the data of untrusted clients cannot be audited, FL is vulnerable to poisoning attacks, especially backdoor attacks. By training locally on poisoned data or directly modifying model parameters, attackers can easily inject backdoors into the model, causing it to misclassify images that contain the targeted patterns. To address this issue, we propose a novel data-free, trigger-generation-based defense approach that builds on two characteristics of backdoor attacks: i) triggers are learned faster than normal knowledge, and ii) trigger patterns have a greater effect on image classification than normal class patterns. Our approach generates images that capture newly learned knowledge by identifying the differences between the old and new global models, and filters out trigger images by evaluating the effect of these generated images. Using these trigger images, our approach eliminates poisoned models to ensure that the updated global model is benign. Comprehensive experiments demonstrate that our approach can defend against almost all existing types of backdoor attacks and outperforms all seven state-of-the-art defense methods in both IID and non-IID scenarios. In particular, our approach can successfully defend against backdoor attacks even when 80% of the clients are malicious.
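The final step described above, using trigger images to eliminate poisoned models, can be illustrated with a minimal sketch. The abstract does not state the elimination criterion, so everything below is an assumption made for illustration: the names `trigger_hit_rate` and `keep_benign_models`, the `max_hit_rate` threshold, the assumption that the target label is known, and the toy models are all hypothetical and are not the authors' method. The sketch assumes that trigger images have already been generated and simply rejects any client model that sends most of them to the target label.

```python
from typing import Callable, List, Sequence

Image = Sequence[float]          # flattened pixel values (toy representation)
Model = Callable[[Image], int]   # maps an image to a predicted class label


def trigger_hit_rate(model: Model, trigger_images: List[Image], target_label: int) -> float:
    """Fraction of trigger images that the model assigns to the target label."""
    if not trigger_images:
        return 0.0
    hits = sum(1 for image in trigger_images if model(image) == target_label)
    return hits / len(trigger_images)


def keep_benign_models(
    models: List[Model],
    trigger_images: List[Image],
    target_label: int,
    max_hit_rate: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[int]:
    """Return the indices of models whose trigger hit rate is at or below the threshold."""
    return [
        index
        for index, model in enumerate(models)
        if trigger_hit_rate(model, trigger_images, target_label) <= max_hit_rate
    ]


def benign(image: Image) -> int:
    # Toy classifier: class 1 if the mean brightness exceeds 0.5, else class 0.
    return 1 if sum(image) / len(image) > 0.5 else 0


def poisoned(image: Image) -> int:
    # Toy backdoor: a very bright first pixel forces the (assumed) target label 7.
    if image[0] > 0.9:
        return 7
    return benign(image)


# Stand-ins for generated trigger images; both carry the bright first pixel.
triggers = [[1.0, 0.1, 0.1, 0.1], [0.95, 0.8, 0.9, 0.7]]
kept = keep_benign_models([benign, poisoned], triggers, target_label=7)
print(kept)  # [0]
```

In this toy setting only the benign model survives the filter. A real defense would also need to generate the trigger images from the difference between the old and new global models, which this sketch does not attempt.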