Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data while preserving user privacy. However, the typical FL paradigm faces challenges to both privacy and robustness: the transmitted model updates can leak sensitive user information, and the lack of central control over the local training process leaves the global model susceptible to malicious manipulation of model updates. Current solutions attempting to address both problems under the one-server FL setting fall short in two respects: 1) they support only simple validity checks that are insufficient against advanced attacks (e.g., checking the norm of each individual update); and 2) they leak partial private information when executing more sophisticated robust aggregation algorithms (e.g., pairwise distances between model updates are revealed for multi-Krum). In this work, we formalize a novel security notion of aggregated privacy, which characterizes the minimum amount of user information, in the form of aggregated statistics of users' updates, that must be revealed to accomplish more advanced robust aggregation. We develop a general framework, PriRoAgg, that utilizes Lagrange coded computing and distributed zero-knowledge proofs to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy. As concrete instantiations of PriRoAgg, we construct two secure and robust protocols based on state-of-the-art robust algorithms, for which we provide full theoretical analyses of security and complexity. Extensive experiments on these protocols demonstrate their robustness against various model-integrity attacks and their efficiency advantages over baselines.
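To make the leakage concern concrete, the sketch below illustrates (in plain NumPy, names and shapes are our own illustrative choices, not from the paper) how a multi-Krum-style aggregator operates: it computes all pairwise distances between client updates, scores each client by the summed distance to its closest neighbors, and averages the lowest-scoring updates. The pairwise-distance matrix is exactly the per-user statistic that a cleartext implementation exposes to the server, motivating the aggregated-privacy notion above.

```python
import numpy as np

def multi_krum(updates, f, m):
    """Simplified multi-Krum sketch (illustrative, not the paper's protocol).

    updates: (n, d) array of client model updates
    f: assumed number of Byzantine clients
    m: number of updates to keep and average
    """
    n = len(updates)
    # Pairwise squared Euclidean distances between all client updates.
    # In a cleartext deployment this matrix is visible to the server --
    # the partial privacy leakage the abstract refers to.
    dists = np.sum((updates[:, None, :] - updates[None, :, :]) ** 2, axis=-1)
    scores = []
    for i in range(n):
        d = np.sort(np.delete(dists[i], i))      # distances to the other n-1 clients
        scores.append(d[: n - f - 2].sum())      # sum over the n-f-2 closest neighbors
    selected = np.argsort(scores)[:m]            # keep the m best-scoring clients
    return updates[selected].mean(axis=0)

# Toy run: five honest updates near (1, 1) and one malicious outlier.
honest = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.05, 0.95], [0.95, 1.05]])
malicious = np.array([[100.0, 100.0]])
agg = multi_krum(np.vstack([honest, malicious]), f=1, m=3)
```

Here the malicious update receives a large score (it is far from every neighbor) and is excluded, so the aggregate stays close to the honest mean.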