Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation owing to its strong potential in capturing underlying data statistics while preserving data privacy. However, under the data heterogeneity that arises in practice among FL clients, existing FL frameworks still fall short in capturing the overall feature properties of local client data that follow disparate distributions. In response, generative adversarial networks (GANs) have recently been exploited in FL to address data heterogeneity, since GANs can be integrated for data regeneration without exposing the original raw data. Despite some successes, existing GAN-based FL frameworks often incur heavy communication costs and raise additional privacy concerns, which limit their application in real scenarios. To this end, this work proposes a novel FL framework that requires only partial GAN model sharing. Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions across clients and to strengthen privacy preservation at reduced communication cost, especially over wireless networks. Our analysis demonstrates the convergence and privacy benefits of the proposed PS-FedGAN framework. Experimental results on several well-known benchmark datasets show that PS-FedGAN holds great promise for tackling FL under non-IID client data distributions while securing data privacy and lowering communication overhead.