Fabricated Flips: Poisoning Federated Learning without Data

Attacks on Federated Learning (FL) can severely reduce the quality of the generated models and limit the usefulness of this emerging learning paradigm that enables on-premise decentralized learning. However, existing untargeted attacks are not practical for many scenarios as they assume that i) the attacker knows every update of benign clients, or ii) the attacker has a large dataset to locally train updates imitating benign parties. In this paper, we propose a data-free untargeted attack (DFA) that synthesizes malicious data to craft adversarial models without eavesdropping on the transmissions of benign clients or requiring a large quantity of task-specific training data. We design two variants of DFA, namely DFA-R and DFA-G, which differ in how they trade off stealthiness and effectiveness. Specifically, DFA-R iteratively optimizes a malicious data layer to minimize the prediction confidence of all outputs of the global model, whereas DFA-G interactively trains a malicious data generator network by steering the output of the global model toward a particular class. Experimental results on Fashion-MNIST, CIFAR-10, and SVHN show that DFA, despite requiring fewer assumptions than existing attacks, achieves a similar or even higher attack success rate than state-of-the-art untargeted attacks against various state-of-the-art defense mechanisms. Concretely, the DFA variants evade all considered defense mechanisms in at least 50% of the cases for CIFAR-10 and often reduce the accuracy by more than a factor of 2. Consequently, we design REFD, a defense specifically crafted to protect against data-free attacks. REFD leverages a reference dataset to detect updates that are biased or have low confidence. It greatly improves upon existing defenses by filtering out the malicious updates and achieves high global model accuracy.
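
The idea of using a reference dataset to flag biased or low-confidence updates can be illustrated with a minimal sketch. This is not the REFD algorithm itself, whose exact scoring is not given here. It assumes that each client update is evaluated on a class-balanced reference set, that bias is measured by the normalized entropy of the predicted-class histogram, that confidence is the mean top softmax probability, and that the two are combined by a simple product. The names `reference_scores` and `filter_updates` are hypothetical.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reference_scores(logits_per_sample, num_classes):
    """Return (balance, confidence) for one update's outputs on the reference set.

    balance:    normalized entropy of the predicted-class histogram
                (about 1.0 = predictions spread evenly, 0.0 = one class only).
    confidence: mean of the highest softmax probability per sample.
    Both measures are illustrative assumptions, not the paper's definitions.
    """
    probs = [softmax(sample) for sample in logits_per_sample]
    predicted = [p.index(max(p)) for p in probs]
    confidence = sum(max(p) for p in probs) / len(probs)

    n = len(predicted)
    counts = Counter(predicted)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    balance = entropy / math.log(num_classes)
    return balance, confidence


def filter_updates(outputs_by_client, num_classes, num_reject):
    """Drop the num_reject clients with the lowest balance * confidence score."""
    scores = {}
    for client_id, logits in outputs_by_client.items():
        balance, confidence = reference_scores(logits, num_classes)
        scores[client_id] = balance * confidence  # assumed combination rule
    ranked = sorted(scores, key=scores.get)  # lowest score first
    rejected = set(ranked[:num_reject])
    return [cid for cid in outputs_by_client if cid not in rejected]


if __name__ == "__main__":
    outputs = {
        "benign": [[5, 0, 0], [0, 5, 0], [0, 0, 5]],              # confident, balanced
        "biased": [[5, 0, 0], [5, 0, 0], [5, 0, 0]],              # always one class
        "low_conf": [[0.1, 0, 0], [0, 0.1, 0], [0, 0, 0.1]],      # near-uniform outputs
    }
    print(filter_updates(outputs, num_classes=3, num_reject=2))
```

In this toy setup, an update that steers every reference sample toward one class gets a balance of zero, and an update that flattens the output distribution gets a low confidence, so both rank below the benign update. A real deployment would need to choose how many updates to reject and how to weight the two measures.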
