Federated learning enables training high-utility models across multiple clients without directly sharing their private data. As a downside, the federated setting makes the model vulnerable to various adversarial attacks in the presence of malicious clients. Despite theoretical and empirical successes in defending against attacks that aim to degrade model utility, defending against backdoor attacks, which increase model accuracy on backdoor samples exclusively without hurting the utility on other samples, remains challenging. To this end, we first analyze the failure modes of existing defenses over a flat loss landscape, which is common for well-designed neural networks such as ResNet (He et al., 2015) but is often overlooked by previous works. We then propose an invariant aggregator that redirects the aggregated update toward invariant directions that are generally useful, by selectively masking out update elements that favor only a few, possibly malicious, clients. Theoretical results suggest that our approach provably mitigates backdoor attacks and remains effective over flat loss landscapes. Empirical results on three datasets with different modalities and varying numbers of clients further demonstrate that our approach mitigates a broad class of backdoor attacks with negligible cost to model utility.
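To make the masking idea concrete, here is a minimal sketch of one plausible instantiation: an element-wise sign-agreement mask that zeroes out update coordinates supported by only a minority of clients before averaging. The function name `invariant_aggregate` and the `agreement_threshold` parameter are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import numpy as np

def invariant_aggregate(client_updates, agreement_threshold=0.8):
    """Aggregate client updates, keeping only coordinates whose update
    direction is shared by at least `agreement_threshold` of the clients.

    client_updates: array of shape (num_clients, num_params).
    Returns a masked mean update of shape (num_params,).
    """
    updates = np.asarray(client_updates, dtype=float)  # (C, P)
    signs = np.sign(updates)                           # per-element directions
    # If a fraction p of clients share the majority sign (assuming no
    # exactly-zero elements), the mean sign equals 2p - 1, so
    # |mean sign| >= 2t - 1  iff  p >= t.
    agreement = np.abs(signs.mean(axis=0))
    # Mask out coordinates that only a few (possibly malicious) clients favor.
    mask = agreement >= (2 * agreement_threshold - 1)
    return updates.mean(axis=0) * mask

if __name__ == "__main__":
    # Toy example: 5 clients, 2 parameters. Four clients agree on the
    # direction of the first parameter; the second is contested.
    ups = np.array([[ 0.5,  0.1],
                    [ 0.4, -0.2],
                    [ 0.6,  0.3],
                    [ 0.5, -0.1],
                    [-0.2, -0.3]])
    print(invariant_aggregate(ups))  # -> [0.36, 0.0]; contested element masked
```

In this toy run, the contested second coordinate is zeroed out, so an element favored by only a minority of (possibly malicious) clients cannot steer the aggregated update.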