Cross-silo federated learning (FL) allows data owners to train accurate machine learning models by benefiting from each other's private datasets. Unfortunately, the model accuracy benefits of collaboration are often undermined by privacy defenses. Therefore, to incentivize client participation in privacy-sensitive domains, an FL protocol should strike a delicate balance between privacy guarantees and end-model accuracy. In this paper, we study the question of when and how a server could design an FL protocol that is provably beneficial for all participants. First, we provide necessary and sufficient conditions for the existence of mutually beneficial protocols in the context of mean estimation and convex stochastic optimization. We also derive protocols that maximize the clients' total utility, given symmetric privacy preferences. Finally, we design protocols that maximize end-model accuracy and demonstrate their benefits in synthetic experiments.