Federated Learning (FL), a privacy-preserving decentralized machine learning framework, has been shown to be vulnerable to backdoor attacks. Current research focuses primarily on the Single-Label Backdoor Attack (SBA), in which adversaries share a common target label. However, a critical fact is overlooked: adversaries may be non-cooperative, pursue distinct targets, and operate independently, which constitutes a more practical scenario called the Multi-Label Backdoor Attack (MBA). Unfortunately, prior attacks are ineffective in the MBA scenario because the backdoors of non-cooperative attackers exclude one another. In this work, we conduct an in-depth investigation to uncover the inherent cause of this exclusion: similar backdoor mappings are constructed for different targets, resulting in conflicts among the backdoor functions. To address this limitation, we propose Mirage, the first non-cooperative MBA strategy in FL, which allows attackers to inject effective and persistent backdoors into the global model without collusion by constructing in-distribution (ID) backdoor mappings. Specifically, we introduce an adversarial adaptation method that bridges the backdoor features and the target distribution in an ID manner. We further leverage a constrained optimization method to ensure that the ID mapping survives the global training dynamics. Extensive evaluations demonstrate that Mirage outperforms various state-of-the-art attacks and bypasses existing defenses, achieving an average ASR greater than 97\% and maintaining over 90\% after 900 rounds. This work aims to alert researchers to this potential threat and to inspire the design of effective defense mechanisms. Code has been made open-source.