This paper proposes a new data-untethered model poisoning (MP) attack on federated learning (FL), that is, an attack that requires no access to the FL training data. The attack extends an adversarial variational graph autoencoder (VGAE) to create malicious local models based solely on the benign local models it overhears. The resulting VGAE-MP attack is both effective and difficult to detect. It extracts the graph structural correlations among the benign local models and the training data features, adversarially regenerates the graph structure, and generates malicious local models from the adversarial graph structure and the benign models' features. A new attack algorithm is also presented that trains the malicious local models using the VGAE and sub-gradient descent, while enabling an optimal selection of the benign local models used to train the VGAE. Experiments demonstrate a gradual drop in FL accuracy under the proposed VGAE-MP attack and show that existing defense mechanisms fail to detect it, which poses a severe threat to FL.
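To make the notion of "graph structural correlations among the benign local models" concrete, the following is a minimal sketch of one way such a graph could be formed: each local model is flattened into a vector, and two models are linked when their cosine similarity exceeds a threshold. The similarity measure, the threshold value, and the function name `similarity_graph` are assumptions made for illustration; the abstract does not specify how the graph is constructed, and this sketch covers neither the VGAE nor the adversarial regeneration step.

```python
import numpy as np


def similarity_graph(local_models: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a binary adjacency matrix over local models.

    local_models has shape (num_models, num_params), one flattened model per row.
    Cosine similarity and the 0.5 threshold are illustrative assumptions,
    not details taken from the paper.
    """
    norms = np.linalg.norm(local_models, axis=1, keepdims=True)
    unit = local_models / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sim = unit @ unit.T                                # pairwise cosine similarity
    adj = (sim > threshold).astype(float)              # link sufficiently similar models
    np.fill_diagonal(adj, 0.0)                         # no self-loops
    return adj


# Toy example: the first two models are nearly parallel, the third is orthogonal.
models = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
adj = similarity_graph(models)
```

In this toy case only the first two models end up connected. A graph autoencoder would take an adjacency matrix of this kind together with the models' features as its input.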