In a federated learning (FL) system, distributed clients upload their local models to a central server, which aggregates them into a global model. Malicious clients may plant backdoors in the global model by uploading poisoned local models, causing images with specific patterns to be misclassified into target labels. Backdoors planted by current attacks are not durable: they vanish quickly once the attackers stop poisoning the model. In this paper, we investigate how the durability of FL backdoors depends on the relationships between benign images and poisoned images (i.e., the images whose labels are flipped to the target label during local training). Specifically, we find that benign images carrying either the original label or the target label of the poisoned images have key effects on backdoor durability. Consequently, we propose a novel attack, Chameleon, which uses contrastive learning to amplify these effects and produce a more durable backdoor. Extensive experiments demonstrate that Chameleon significantly extends the backdoor lifespan over baselines by $1.2\times \sim 4\times$ across a wide range of image datasets, backdoor types, and model architectures.
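To make the setting described in the abstract concrete, the following is a minimal sketch of the two generic ingredients it mentions: server-side aggregation of local models, and a poisoned image whose label is flipped to the target label. It is not the Chameleon attack and does not include its contrastive learning component. The function names `fedavg` and `poison_image`, the unweighted averaging rule, and the square corner trigger are all illustrative assumptions, not details taken from the paper.

```python
def fedavg(local_models):
    """Coordinate-wise unweighted mean of client weight vectors.

    Assumption: every client has equal weight; real systems often
    weight clients by their local dataset size.
    """
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]


def poison_image(image, target_label, patch_size=2, patch_value=1.0):
    """Return a copy of `image` with a square trigger and a flipped label.

    Assumption: the trigger is a constant square patch in the
    bottom-right corner; `image` is a list of rows of floats.
    """
    poisoned = [row[:] for row in image]  # copy so the benign image is untouched
    height = len(poisoned)
    for r in range(height - patch_size, height):
        width = len(poisoned[r])
        for c in range(width - patch_size, width):
            poisoned[r][c] = patch_value
    return poisoned, target_label


# Hypothetical round: nine benign clients and one malicious client.
benign_updates = [[0.0, 0.0] for _ in range(9)]
malicious_update = [10.0, 10.0]
global_model = fedavg(benign_updates + [malicious_update])

# Hypothetical 4x4 benign image relabeled to target label 7.
benign_image = [[0.0] * 4 for _ in range(4)]
poisoned_image, poisoned_label = poison_image(benign_image, target_label=7)
```

Under these assumptions, a single poisoned update shifts the average only while it keeps being submitted; once the malicious client stops, later benign rounds pull the global model back, which is the durability problem the abstract refers to.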