Federated learning (FL) aims to keep client data local to preserve privacy. Instead of gathering the data itself, the server only collects aggregated gradient updates from clients. Following the popularity of FL, a considerable amount of work has revealed the vulnerability of FL approaches by reconstructing the input data from gradient updates. Yet most existing works assume an FL setting with an unrealistically small batch size, and produce poor-quality images when the batch size is large. Other works modify the neural network architecture or parameters to the point of being suspicious, and can thus be detected by clients. Moreover, most of them can only reconstruct a single input sample from a large batch. To address these limitations, we propose a novel and completely analytical approach, referred to as maximum knowledge orthogonality reconstruction (MKOR), to reconstruct clients' input data. Our proposed method reconstructs mathematically proven high-quality images from large batches. MKOR only requires the server to send secretly modified parameters to clients, and can efficiently and inconspicuously reconstruct the input images from clients' gradient updates. We evaluate MKOR's performance on the MNIST, CIFAR-100, and ImageNet datasets and compare it with state-of-the-art works. The results show that MKOR outperforms the existing approaches, and draw attention to a pressing need for further research on privacy protection in FL so that comprehensive defense approaches can be developed.
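To make the underlying leakage concrete, the following is a minimal sketch of the well-known single-sample case, not of MKOR itself. It assumes a single fully connected layer with a squared-error loss and a batch of size one; the function names `linear_layer_gradients` and `recover_input` are hypothetical and chosen only for illustration. Under these assumptions the weight gradient is the outer product of the bias gradient and the input, so the input can be read off by dividing one row of the weight gradient by the matching bias gradient entry.

```python
import random


def linear_layer_gradients(weights, bias, x, target):
    """Gradients of 0.5 * ||W x + b - target||^2 w.r.t. W and b for one sample."""
    outputs = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    delta = [o - t for o, t in zip(outputs, target)]  # dL/db
    grad_w = [[d * xi for xi in x] for d in delta]    # dL/dW = delta * x^T
    return grad_w, delta


def recover_input(grad_w, grad_b, eps=1e-12):
    """Recover x as (row i of dL/dW) / dL/db[i], using the row with the largest |dL/db[i]|."""
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if abs(grad_b[i]) < eps:
        raise ValueError("all bias gradients are ~0; the input cannot be recovered")
    return [g / grad_b[i] for g in grad_w[i]]


# Illustrative setup (assumed values): an 8-dimensional private input, 4 output units.
rng = random.Random(0)
x = [rng.random() for _ in range(8)]
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
bias = [rng.uniform(-1, 1) for _ in range(4)]
target = [1.0, 0.0, 0.0, 0.0]

# The client computes gradients locally; the server only ever sees grad_w and grad_b.
grad_w, grad_b = linear_layer_gradients(weights, bias, x, target)
x_rec = recover_input(grad_w, grad_b)
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
```

With a batch larger than one, the gradients are aggregated over samples and this simple division no longer isolates an individual input, which is the large-batch regime the abstract identifies as difficult for existing attacks.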