Federated learning (FL) is a machine learning paradigm in which distributed local nodes collaboratively train a central model without sharing their privately held data. Existing FL methods either iteratively share local model parameters or rely on co-distillation. However, the former is highly susceptible to private data leakage, and the latter requires task-relevant real data as a prerequisite. Instead, we propose a data-free FL framework based on local-to-central collaborative distillation that directly exploits the input and output spaces. Our design eliminates any need for recursive local parameter exchange or auxiliary task-relevant data to transfer knowledge, thereby giving local users direct control over privacy. In particular, to cope with the inherent data heterogeneity across local nodes, our technique learns to distill inputs on which each local model produces consensual yet unique results that represent its own expertise. Extensive experiments on image classification and segmentation tasks, under various real-world heterogeneous federated learning settings on both natural and medical images, show that our proposed FL framework achieves notable privacy-utility trade-offs.