Federated Learning (FL) was initially proposed as a privacy-preserving machine learning paradigm. However, FL has since been shown to be susceptible to a series of privacy attacks. Of recent concern is the Source Inference Attack (SIA), in which an honest-but-curious central server attempts to identify exactly which client owns a given data point used during training. Alarmingly, standard gradient obfuscation techniques based on Differential Privacy have been shown to be ineffective against SIAs, at least without severely diminishing model accuracy. In this work, we propose a defense against SIAs within the widely studied shuffle model of FL, where an honest shuffler acts as an intermediary between the clients and the server. First, we demonstrate that naive shuffling alone is insufficient to prevent SIAs. To defend against them effectively, shuffling must be applied at a finer granularity; we therefore propose a novel combination of parameter-level shuffling with the residue number system (RNS). Our approach provides robust protection against SIAs without affecting the accuracy of the joint model and can be seamlessly integrated with other privacy-protection mechanisms. Experiments on a range of models and datasets confirm that standard shuffling approaches fail to prevent SIAs and that, in contrast, our proposed method reduces the attack's accuracy to the level of random guessing.
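The core building block behind parameter-level shuffling with RNS can be sketched as follows. This is a minimal illustrative example only: the moduli, the quantization of parameters to integers, and the function names are hypothetical choices for exposition, not the paper's actual construction. The key property is that an integer-quantized parameter splits into residues modulo pairwise-coprime moduli, each of which reveals little on its own, while the full set recombines exactly via the Chinese Remainder Theorem (CRT):

```python
from math import prod

# Hypothetical pairwise-coprime moduli (255 = 3*5*17, 256 = 2^8, 257 prime);
# their product, 16,776,960, is the representable integer range.
MODULI = (255, 256, 257)

def rns_encode(x: int) -> list[int]:
    """Split an integer-quantized parameter into its residues."""
    return [x % m for m in MODULI]

def rns_decode(residues: list[int]) -> int:
    """Recombine residues via the Chinese Remainder Theorem (CRT)."""
    M = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(Mi, -1, m) is the modular inverse
    return x % M

# Each residue stream can be shuffled independently across clients before
# reaching the server; the server can still decode a parameter once its
# residues are recombined, so model accuracy is unaffected.
param = 123456
residues = rns_encode(param)
assert rns_decode(residues) == param
```

The design intuition is that decomposition into residues gives the shuffler multiple independent channels per parameter, so the server cannot link any single residue back to its originating client.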