Federated Learning with Model Distillation (FedMD) is a nascent collaborative learning paradigm in which only the output logits on public datasets are transmitted as distilled knowledge. Private model parameters, which are susceptible to gradient inversion attacks (a known privacy risk in federated learning), are never shared. In this paper, we find that although sharing output logits on public datasets is safer than directly sharing gradients, carefully designed malicious attacks can still cause substantial data exposure. Our study shows that a malicious server can launch a Paired-Logits Inversion (PLI) attack against FedMD and its variants by training an inversion neural network that exploits the confidence gap between the server and client models. Experiments on multiple facial recognition datasets validate that, under FedMD-like schemes, a malicious server using only the paired server-client logits on public datasets can reconstruct private images on all tested benchmarks with a high success rate.
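To make the setting concrete, the following is a minimal sketch of what a server in a FedMD-like scheme observes on the public dataset. It is not the attack itself and not the authors' implementation. It assumes that the server averages client logits into a consensus, and it uses a hypothetical gap measure (client top-1 probability minus server top-1 probability) purely for illustration; the function names `aggregate_logits` and `confidence_gap` and all numeric values are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_logits(per_client_logits):
    """Average client logits per public sample (assumed consensus rule).

    per_client_logits[c][i][k] is client c's logit for class k on public sample i.
    """
    n_clients = len(per_client_logits)
    n_samples = len(per_client_logits[0])
    n_classes = len(per_client_logits[0][0])
    return [
        [
            sum(client[i][k] for client in per_client_logits) / n_clients
            for k in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def confidence_gap(client_logits, server_logits):
    """Hypothetical gap measure: client top-1 probability minus server top-1 probability."""
    return [
        max(softmax(c)) - max(softmax(s))
        for c, s in zip(client_logits, server_logits)
    ]


# Toy public set: 2 samples, 3 classes, 2 clients (all values are made up).
clients = [
    [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
]
# Logits of the server's own model on the same public samples.
server = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

consensus = aggregate_logits(clients)
gaps = confidence_gap(clients[1], server)
print(consensus)
print(gaps)
```

The point of the sketch is that the server holds, for every public sample, a pair of logit vectors: one from a client and one from its own model. Per the abstract, such paired logits are the only input the PLI attack needs to train its inversion network.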