Secure Aggregation in Federated Learning is not Private: Leaking User Data at Large Scale through Model Modification

Security and privacy are important concerns in machine learning. End-user devices often hold a wealth of sensitive data that should not be shared with servers or enterprises. Federated learning was introduced to enable machine learning over large decentralized datasets while promising privacy by eliminating the need to share data. However, prior work has shown that shared gradients often contain private information, and attackers can gain knowledge either by maliciously modifying the architecture and parameters or by using optimization to approximate user data from the shared gradients. Despite this, most attacks so far scale poorly with the number of clients, and in particular fail when client gradients are combined using secure model aggregation. The attacks that still function are strongly limited in the number of clients attacked, the number of training samples they leak, or the number of iterations they need to be trained. In this work, we introduce MANDRAKE, an attack that overcomes these limitations to directly leak large amounts of client data even under secure aggregation across large numbers of clients. Furthermore, we break the anonymity of aggregation: the leaked data is identifiable and can be tied directly back to the client it came from. We show that by sending clients customized convolutional parameters, the weight gradients of data points from different clients remain separate through aggregation. With aggregation across many clients, prior work could leak less than 1% of images. With the same number of non-zero parameters, and using only a single training iteration, MANDRAKE leaks 70-80% of data samples.
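
The separation idea can be illustrated with a deliberately simplified toy, which is not the MANDRAKE construction itself. It assumes one data point per client, a single linear layer with loss `w @ z`, and a hypothetical client-specific `routing_matrix` standing in for the customized convolutional parameters. Under these assumptions, the sketch shows that summing gradients mixes client data when all clients receive identical parameters, but leaves each client's contribution in its own block when the parameters are customized per client.

```python
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim = 4, 6

# Hypothetical private data: one sample per client.
data = rng.normal(size=(num_clients, dim))


def routing_matrix(slot):
    """Toy stand-in for customized parameters: copies the input into one slot."""
    P = np.zeros((num_clients * dim, dim))
    P[slot * dim:(slot + 1) * dim] = np.eye(dim)
    return P


def client_gradient(x, P):
    """Gradient of the toy loss w @ (P @ x) with respect to w, which is P @ x."""
    return P @ x


def secure_aggregate(grads):
    """Idealized secure aggregation: the server only sees the sum."""
    return np.sum(grads, axis=0)


# Identical parameters for every client: contributions collapse into one sum.
shared = secure_aggregate([client_gradient(x, routing_matrix(0)) for x in data])

# Client-specific parameters: each contribution lands in its own block.
custom = secure_aggregate(
    [client_gradient(x, routing_matrix(i)) for i, x in enumerate(data)]
)

# Row i of the reshaped aggregate is client i's sample, so it is also identifiable.
recovered = custom.reshape(num_clients, dim)
```

In this toy, `shared` exposes only the sum of all samples, while `recovered` matches `data` row by row, which mirrors the claim that the leaked data can be tied back to individual clients. A real attack must also handle nonlinearities, batches of several samples per client, and unknown loss gradients, none of which are modeled here.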
