Federated learning allows clients in a distributed system to jointly train a machine learning model. However, clients' models are vulnerable to attacks during both the training and testing phases. In this paper, we address adversarial clients performing "internal evasion attacks": crafting evasion attacks at test time to deceive other clients. For example, adversaries may aim to deceive spam filters and recommendation systems trained with federated learning for monetary gain. Because weight information is shared among clients in a federated learning setting, adversarial clients have extensive information about the victim model. We are the first to characterize the transferability of such internal evasion attacks across different learning methods and to analyze the trade-off between model accuracy and robustness as a function of the similarity of clients' data. We show that adversarial training defenses in the federated learning setting yield only limited improvements against internal attacks. However, combining adversarial training with personalized federated learning frameworks increases relative internal attack robustness by 60% compared to federated adversarial training and performs well under limited system resources.
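To make the notion of an internal evasion attack concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a toy logistic-regression model, uses the fast gradient sign method (FGSM) as a stand-in attack, and invents all weights, inputs, and names (`w_global`, `w_victim`, `fgsm`) purely for illustration. The adversary crafts a perturbation using only the shared global weights and then checks whether it transfers to a victim whose personalized weights differ slightly.

```python
import math

def logit(w, b, x):
    # Linear score w.x + b of a logistic-regression model.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    # Hard label: 1 if the score is positive, else 0.
    return 1 if logit(w, b, x) > 0 else 0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # Gradient of the logistic loss with respect to the input is
    # (sigmoid(w.x + b) - y) * w; FGSM steps eps along its sign.
    err = sigmoid(logit(w, b, x)) - y
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]

# Hypothetical weights: the adversary knows the shared global model,
# while the victim uses a slightly different (personalized) model.
w_global, w_victim, b = [1.0, -2.0, 0.5], [1.2, -1.7, 0.4], 0.0

x, y, eps = [0.5, -0.2, 0.3], 1, 0.4   # clean input, true label, budget

# The attack is crafted with the global weights only.
x_adv = fgsm(w_global, b, x, y, eps)

clean_pred = predict(w_victim, b, x)      # victim on the clean input
adv_pred = predict(w_victim, b, x_adv)    # victim on the perturbed input
print(clean_pred, adv_pred)
```

In this toy setting the victim classifies the clean input correctly but mislabels the perturbed one, because its weights remain close to the shared model. How well such perturbations transfer as client models diverge is the question the abstract raises.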