Federated learning enables training high-utility models across several clients without directly sharing their private data. As a downside, the federated setting makes the model vulnerable to various adversarial attacks when malicious clients are present. Despite theoretical and empirical success in defending against attacks that aim to degrade a model's utility, defending against backdoor attacks remains challenging, because such attacks increase model accuracy exclusively on backdoor samples without hurting utility on other samples. To this end, we first analyze the failure modes of existing defenses over a flat loss landscape, which is common for well-designed neural networks such as ResNet [He et al., 2015] but is often overlooked by previous work. We then propose an invariant aggregator that redirects the aggregated update toward invariant directions that are generally useful, by selectively masking out the update elements that favor only a few, possibly malicious, clients. Theoretical results suggest that our approach provably mitigates backdoor attacks and remains effective over flat loss landscapes. Empirical results on three datasets with different modalities and varying numbers of clients further demonstrate that our approach mitigates a broad class of backdoor attacks at a negligible cost to model utility.
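The masking idea described above can be illustrated with a minimal sketch. It is not the aggregator proposed in this work: the sign-agreement criterion, the function name `masked_aggregate`, and the `agreement_threshold` parameter are all assumptions chosen for illustration. The sketch averages client updates coordinate by coordinate and zeroes any coordinate whose update direction is not shared by most clients.

```python
def masked_aggregate(updates, agreement_threshold=0.8):
    """Average client updates, masking coordinates with low sign agreement.

    updates: list of per-client update vectors (lists of floats, equal length).
    agreement_threshold: hypothetical hyperparameter in [0, 1]; a coordinate
        is kept only if |mean of signs| across clients reaches this value.
    """
    num_clients = len(updates)
    num_params = len(updates[0])
    aggregated = []
    for j in range(num_params):
        column = [u[j] for u in updates]
        # Sign of each client's element: +1, -1, or 0.
        sign_sum = sum((x > 0) - (x < 0) for x in column)
        # 1.0 when every client pushes the same way, near 0 when split
        # or when only a few clients touch this coordinate.
        agreement = abs(sign_sum) / num_clients
        if agreement >= agreement_threshold:
            aggregated.append(sum(column) / num_clients)
        else:
            aggregated.append(0.0)  # masked out
    return aggregated


# Toy example (made-up numbers): five clients, three parameters.
# Coordinate 0 is pushed in the same direction by everyone,
# coordinate 1 is touched by a single client only,
# coordinate 2 has clients disagreeing on the direction.
client_updates = [
    [1.0, 0.0, 0.5],
    [1.2, 0.0, -0.5],
    [0.8, 0.0, 0.4],
    [1.0, 0.0, -0.6],
    [1.0, 5.0, 0.5],
]
result = masked_aggregate(client_updates)
```

In this toy input, only coordinate 0 survives the mask. The large element that a single client contributes to coordinate 1 is zeroed, whereas a plain average would have let it through.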