Federated Learning (FL) has been proposed as a privacy-preserving solution for distributed machine learning, particularly in heterogeneous FL settings where clients have varying computational capabilities and thus train models whose complexity differs from that of the server's model. However, FL is not without vulnerabilities: recent studies have shown that it is susceptible to membership inference attacks (MIA), which can compromise the privacy of client data. In this paper, we examine the intersection of these two aspects, heterogeneous FL and its privacy vulnerabilities, by focusing on the role of client model integration, the process through which the server integrates parameters from clients' smaller models into its larger model. To better understand this process, we first propose a taxonomy that categorizes existing heterogeneous FL methods and enables the design of seven novel heterogeneous FL model integration strategies. Using the CIFAR-10, CIFAR-100, and FEMNIST vision datasets, we evaluate the privacy and accuracy trade-offs of these approaches under three types of MIAs. Our findings reveal significant differences in privacy leakage and performance across integration methods. Notably, introducing randomness into the model integration process enhances client privacy while maintaining competitive accuracy for both the clients and the server. This work sheds quantitative light on the privacy-accuracy implications of client model integration in heterogeneous FL settings, paving the way towards more secure and efficient FL systems.
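To make the integration step concrete, the following is a minimal sketch of one possible extract-and-integrate round for a single dense layer, in the spirit of the randomized integration described above: the server carves out a random sub-model for a client and later writes the client's trained parameters back at the same positions. The function names (`extract_submodel`, `integrate`), the layer shapes, and the stand-in "training" update are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Server holds a "large" dense layer of shape (out_features, in_features).
server_W = rng.normal(size=(8, 16))

def extract_submodel(W, out_keep, in_keep, rng):
    """Randomly select a subset of rows and columns of the server layer
    to form a smaller client layer (randomized integration strategy)."""
    rows = rng.choice(W.shape[0], size=out_keep, replace=False)
    cols = rng.choice(W.shape[1], size=in_keep, replace=False)
    return W[np.ix_(rows, cols)].copy(), rows, cols

def integrate(W, client_W, rows, cols):
    """Write the client's trained sub-model parameters back into the
    server model at exactly the positions they were extracted from."""
    W[np.ix_(rows, cols)] = client_W
    return W

# One round for one client: extract, train locally, integrate.
client_W, rows, cols = extract_submodel(server_W, out_keep=4, in_keep=8, rng=rng)
client_W += 0.01 * rng.normal(size=client_W.shape)  # stand-in for a local training update
server_W = integrate(server_W, client_W, rows, cols)
```

Because the row/column indices are resampled each round, an attacker observing the server model sees client contributions scattered across changing parameter subsets, which is one intuition for why randomness in integration can reduce membership leakage.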