Federated learning allows clients in a distributed system to jointly train a machine learning model. However, clients' models are vulnerable to attacks during both the training and testing phases. In this paper, we address the issue of adversarial clients performing "internal evasion attacks": evasion attacks crafted at test time to deceive other clients. For example, adversaries may aim to deceive spam filters and recommendation systems trained with federated learning for monetary gain. Because weight information is shared amongst clients in a federated learning setting, adversarial clients have extensive information about the victim model. We are the first to characterize the transferability of such internal evasion attacks across different learning methods and to analyze the trade-off between model accuracy and robustness as a function of the degree of similarity in client data. We show that adversarial training defenses in the federated learning setting yield only limited improvements against internal attacks. However, combining adversarial training with personalized federated learning frameworks increases relative internal attack robustness by 60% compared to federated adversarial training and performs well under limited system resources.
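To make the notion of an internal evasion attack concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes two hypothetical logistic-regression clients whose weights are similar because they come from the same federated training, and it uses a one-step fast gradient sign method (FGSM) perturbation as a stand-in for whichever attack an adversary might choose. All weights, inputs, and the budget `eps` are illustrative values chosen by hand.

```python
import math

def logit(w, b, x):
    """Linear score of a logistic-regression model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Predicted class (0 or 1)."""
    return int(logit(w, b, x) >= 0.0)

def fgsm(w, b, x, y, eps):
    """One-step FGSM: move x by eps along the sign of the loss gradient.

    For cross-entropy loss on logistic regression, the gradient with
    respect to the input is (sigmoid(w.x + b) - y) * w.
    """
    err = sigmoid(logit(w, b, x)) - y
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]

# Hypothetical models: the adversary only uses its own local weights,
# which resemble the victim's because both derive from shared updates.
w_adv, b_adv = [1.0, -2.0, 0.5], 0.0
w_victim, b_victim = [0.9, -1.8, 0.7], 0.1

x, y = [0.5, -0.3, 0.2], 1          # clean input with true label 1
x_adv = fgsm(w_adv, b_adv, x, y, eps=0.5)

clean_pred = predict(w_victim, b_victim, x)
attacked_pred = predict(w_victim, b_victim, x_adv)
print(clean_pred, attacked_pred)
```

In this toy setting the perturbation is computed without any access to the victim's weights, yet it changes the victim's prediction. That is the transferability effect the abstract refers to; how strongly it appears in practice depends on how similar the clients' models and data are.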