Federated learning (FL) has become a key component of language modeling applications such as machine translation, next-word prediction, and medical record analysis. These applications are trained on datasets contributed by many FL participants, which often contain privacy-sensitive data such as healthcare records, phone and credit card numbers, and login credentials. Although FL allows models to be trained without requiring clients to share their raw data, quantifying the extent of privacy leakage in federated language models remains challenging. Moreover, existing attacks aim to extract data regardless of whether it is sensitive or mundane. To fill this research gap, we present two novel findings on leaking privacy-sensitive user data from federated large language models. First, we observe that model snapshots from intermediate FL rounds can cause greater privacy leakage than the final trained model. Second, we show that privacy leakage can be aggravated by tampering with the selective model weights that are specifically responsible for memorizing sensitive training data. We demonstrate how a malicious client can leak the privacy-sensitive data of other users in FL without any cooperation from the server. Our best-performing method improves membership inference recall by 29% and achieves up to 71% reconstruction of private data, clearly outperforming existing attacks that assume stronger adversary capabilities.
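To make the first observation concrete, below is a minimal sketch of how an adversary could probe intermediate FL snapshots for membership signal by scoring candidate sequences with per-round perplexity. This is an illustration, not the paper's exact attack: the checkpoint paths, the canary string, and the use of Hugging Face `transformers` causal-LM loading are all assumptions introduced here for exposition.

```python
# Hedged sketch: perplexity-based membership probing across intermediate
# FL model snapshots. Checkpoint paths and candidates are hypothetical.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHECKPOINTS = ["ckpt/round_010", "ckpt/round_050", "ckpt/round_100"]  # hypothetical paths
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINTS[0])

def perplexity(model, text):
    """Per-sequence perplexity under a causal LM (lower = more strongly memorized)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token-level cross-entropy
    return math.exp(loss.item())

candidates = ["Alice's SSN is 078-05-1120"]  # illustrative canary, not real data

for path in CHECKPOINTS:
    model = AutoModelForCausalLM.from_pretrained(path).eval()
    for text in candidates:
        print(f"{path}: ppl({text!r}) = {perplexity(model, text):.2f}")

# A sharp perplexity dip at an intermediate round, relative to the final
# model, is read as membership signal -- consistent with the observation
# that intermediate snapshots can leak more than the final model.
```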