MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU, an algorithm-agnostic, privacy-preserving Multiple Perturbed Copies Unlearning framework built around two server-side modules: Pre-Process for randomized copy generation and Post-Process for update aggregation. In Pre-Process, the server distributes multiple perturbed and reparameterized model instances, allowing the client to run unlearning locally on its private forget set without access to the server's exact original parameters. After local unlearning, the server performs Post-Process by inverting the reparameterization and aggregating the updates with a harmonic denoising procedure that reduces the impact of the perturbation. Experiments with seven unlearning algorithms show that MPU achieves unlearning performance comparable to noise-free baselines: for most algorithms, average degradation stays well below 1% at noise levels up to 10%, and under 1% noise MPU can even outperform the noise-free baseline for some algorithms. Code is available at https://github.com/Tristan0318/MPU.
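The Pre-Process / local unlearning / Post-Process flow described above can be illustrated with a toy NumPy sketch. Everything concrete in it is an assumption made for illustration, not the authors' implementation: the function names, the norm-scaled Gaussian noise, the use of a secret coordinate permutation as the reparameterization, and the elementwise placeholder for the client's unlearning step. In particular, the aggregation here is a plain mean, standing in for the harmonic denoising procedure, whose details the abstract does not give.

```python
import numpy as np


def pre_process(theta, num_copies, noise_level, rng):
    """Server: build perturbed, reparameterized copies (hypothetical scheme)."""
    copies, perms = [], []
    for _ in range(num_copies):
        noise = rng.standard_normal(theta.shape)
        # Assumed noise model: rescale so ||noise|| = noise_level * ||theta||.
        noise *= noise_level * np.linalg.norm(theta) / np.linalg.norm(noise)
        # Stand-in reparameterization: a secret random permutation of coordinates.
        perm = rng.permutation(theta.size)
        copies.append((theta + noise)[perm])
        perms.append(perm)
    return copies, perms


def client_unlearn(weights, forget_grad_fn, lr=0.1):
    """Client: one toy gradient-ascent step; only the update is returned."""
    return lr * forget_grad_fn(weights)


def post_process(updates, perms):
    """Server: undo each permutation, then average the restored updates.

    The plain mean is a placeholder, NOT the paper's harmonic denoising.
    """
    restored = []
    for update, perm in zip(updates, perms):
        original_order = np.empty_like(update)
        original_order[perm] = update  # inverse of x[perm]
        restored.append(original_order)
    return np.mean(restored, axis=0)


rng = np.random.default_rng(0)
theta = rng.standard_normal(8)  # the server's private parameters
copies, perms = pre_process(theta, num_copies=4, noise_level=0.1, rng=rng)
# np.tanh is an elementwise placeholder for a forget-set gradient.
updates = [client_unlearn(copy, np.tanh) for copy in copies]
aggregated_update = post_process(updates, perms)
```

In this toy, the client only ever sees noisy, permuted vectors, and the server only ever sees updates, never the forget set. The placeholder gradient is elementwise so that it commutes with the permutation; a real model would need a reparameterization that preserves the network's function.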
