Even though recent years have seen many attacks exposing severe vulnerabilities in Federated Learning (FL), a holistic understanding of what enables these attacks and how they can be mitigated effectively is still lacking. In this work, we demystify the inner workings of existing (targeted) attacks. We provide new insights into why these attacks are possible and why a definitive solution to FL robustness is challenging. We show that the need for ML algorithms to memorize tail data has significant implications for FL integrity. This phenomenon has largely been studied in the context of privacy; our analysis sheds light on its implications for ML integrity. We show that certain classes of severe attacks can be mitigated effectively by enforcing constraints such as norm bounds on clients' updates. We investigate how to efficiently incorporate these constraints into secure FL protocols in the single-server setting. Based on this, we propose RoFL, a new secure FL system that extends secure aggregation with privacy-preserving input validation. Specifically, RoFL can enforce constraints such as $L_2$ and $L_\infty$ bounds on high-dimensional encrypted model updates.
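To make the notion of a norm bound on a client update concrete, the following is a minimal plaintext sketch, not RoFL's protocol: it checks $L_2$ and $L_\infty$ bounds on unencrypted update vectors and averages only the updates that pass. The function names (`l2_norm`, `linf_norm`, `satisfies_bounds`, `aggregate_valid`), the choice to reject rather than clip out-of-bound updates, and the bound values are all assumptions made for illustration. RoFL enforces such constraints on encrypted updates, which this sketch does not attempt.

```python
import math
from typing import List, Sequence


def l2_norm(update: Sequence[float]) -> float:
    """Euclidean (L2) norm of a flattened model update."""
    return math.sqrt(sum(x * x for x in update))


def linf_norm(update: Sequence[float]) -> float:
    """Maximum absolute coordinate (L-infinity norm) of a model update."""
    return max((abs(x) for x in update), default=0.0)


def satisfies_bounds(update: Sequence[float],
                     l2_bound: float,
                     linf_bound: float) -> bool:
    """Return True if the update respects both norm bounds."""
    return l2_norm(update) <= l2_bound and linf_norm(update) <= linf_bound


def aggregate_valid(updates: Sequence[Sequence[float]],
                    l2_bound: float,
                    linf_bound: float) -> List[float]:
    """Average only the updates that pass the norm checks.

    Hypothetical policy: out-of-bound updates are dropped entirely.
    """
    accepted = [u for u in updates if satisfies_bounds(u, l2_bound, linf_bound)]
    if not accepted:
        raise ValueError("no update satisfied the norm bounds")
    dim = len(accepted[0])
    return [sum(u[i] for u in accepted) / len(accepted) for i in range(dim)]


if __name__ == "__main__":
    # Two small honest-looking updates and one scaled-up update.
    client_updates = [[0.3, 0.4], [0.0, 0.5], [30.0, 40.0]]
    print(aggregate_valid(client_updates, l2_bound=1.0, linf_bound=1.0))
```

In this toy setting the third update has a norm far above the bound and is excluded, so it cannot dominate the average. This illustrates why bounding update norms limits the influence of any single client.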