Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning

Federated learning introduces a novel approach to training machine learning (ML) models on distributed data while preserving users' data privacy. The model is distributed to clients, which train it on their local data, and the final model is computed at a central server. To prevent data leakage from the local model updates, various works focusing on secure aggregation for privacy-preserving federated learning have been proposed. Despite their merits, most existing protocols still incur high communication and computation overhead on the participating entities and might not be optimized to efficiently handle the large update vectors of ML models. In this paper, we present E-seaML, a novel secure aggregation protocol with high communication and computation efficiency. E-seaML requires only one round of communication in the aggregation phase and is up to 318x faster for the user and 1224x faster for the server than its most efficient counterpart. E-seaML also allows the integrity of the final model to be verified efficiently, by letting the aggregation server generate a proof of honest aggregation for the participating users. This efficiency and versatility are achieved by extending (and weakening) the assumption that existing works place on the set of honest parties (i.e., users) to a set of assisting nodes: we assume a set of assisting nodes that help the aggregation server in the aggregation process. Given the minimal computation and communication overhead on the assisting nodes, we also discuss how a rotating set of users could serve as assisting nodes in each iteration. We provide an open-source implementation of E-seaML for public verifiability and testing.

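The general idea of secure aggregation with assisting nodes, as outlined in the abstract, can be illustrated with a toy additive-masking sketch. This is not the E-seaML construction itself, and it makes several assumptions: every user already shares a secret seed with every assisting node (a real protocol would establish these through key agreement), SHA-256 stands in for a pseudorandom generator, updates are integer vectors modulo 2^32, and dropouts and the proof of honest aggregation are ignored. All function names are hypothetical.

```python
import hashlib
import secrets

MODULUS = 2**32  # assumption: updates are integer vectors modulo 2^32


def mask_vector(seed, round_id, length):
    """Expand a shared seed into a pseudorandom mask for one training round."""
    mask = []
    for i in range(length):
        data = seed + round_id.to_bytes(4, "big") + i.to_bytes(4, "big")
        mask.append(int.from_bytes(hashlib.sha256(data).digest()[:4], "big"))
    return mask


def add_vectors(a, b):
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def sub_vectors(a, b):
    return [(x - y) % MODULUS for x, y in zip(a, b)]


def user_masked_update(update, seeds_with_nodes, round_id):
    """User side: add one mask per assisting node, then send a single message."""
    masked = [u % MODULUS for u in update]
    for seed in seeds_with_nodes:
        masked = add_vectors(masked, mask_vector(seed, round_id, len(update)))
    return masked


def node_mask_sum(seeds_with_users, round_id, length):
    """Assisting node side: sum the masks it shares with the participating users."""
    total = [0] * length
    for seed in seeds_with_users:
        total = add_vectors(total, mask_vector(seed, round_id, length))
    return total


def server_aggregate(masked_updates, node_sums):
    """Server side: sum the masked updates, then remove the aggregated masks."""
    total = [0] * len(masked_updates[0])
    for masked in masked_updates:
        total = add_vectors(total, masked)
    for node_sum in node_sums:
        total = sub_vectors(total, node_sum)
    return total


def run_demo():
    updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]  # toy local updates
    num_nodes, round_id, length = 2, 7, 3
    # seeds[u][n] is the secret shared by user u and assisting node n (assumed pre-shared)
    seeds = [[secrets.token_bytes(16) for _ in range(num_nodes)] for _ in updates]
    masked = [user_masked_update(u, seeds[i], round_id) for i, u in enumerate(updates)]
    node_sums = [
        node_mask_sum([seeds[i][n] for i in range(len(updates))], round_id, length)
        for n in range(num_nodes)
    ]
    return server_aggregate(masked, node_sums)
```

In this sketch, calling `run_demo()` recovers the element-wise sum of the three toy updates, while the server only ever sees masked vectors and per-node mask sums. Each user sends a single message per round, which mirrors the one-round aggregation phase the abstract describes; the privacy argument relies on at least one assisting node keeping its seeds secret from the server.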