Federated Learning (FL), a privacy-preserving decentralized machine learning framework, has been shown to be vulnerable to backdoor attacks. Current research focuses primarily on the Single-Label Backdoor Attack (SBA), in which adversaries share a common target label. However, a critical fact is overlooked: adversaries may be non-cooperative, pursue distinct targets, and operate independently, which constitutes a more practical scenario called the Multi-Label Backdoor Attack (MBA). Unfortunately, prior attacks are ineffective in the MBA scenario because the backdoors of non-cooperative attackers exclude one another. In this work, we conduct an in-depth investigation to uncover the inherent cause of this exclusion: similar backdoor mappings are constructed for different targets, resulting in conflicts among the backdoor functions. To address this limitation, we propose Mirage, the first non-cooperative MBA strategy in FL, which allows attackers to inject effective and persistent backdoors into the global model without collusion by constructing in-distribution (ID) backdoor mappings. Specifically, we introduce an adversarial adaptation method that bridges the backdoor features and the target distribution in an ID manner. We further leverage a constrained optimization method to ensure that the ID mapping survives the global training dynamics. Extensive evaluations demonstrate that Mirage outperforms various state-of-the-art attacks and bypasses existing defenses, achieving an average ASR greater than 97\% and maintaining over 90\% after 900 rounds. This work aims to alert researchers to this potential threat and to inspire the design of effective defense mechanisms. Code has been made open-source.