Teach Large Language Models to Forget Privacy

Large Language Models (LLMs) have proven powerful, but the risk of privacy leakage remains a significant concern. Traditional privacy-preserving methods, such as Differential Privacy and Homomorphic Encryption, are inadequate for black-box API-only settings, demanding either model transparency or heavy computational resources. We propose Prompt2Forget (P2F), the first framework designed to tackle the LLM local privacy challenge by teaching LLM to forget. The method involves decomposing full questions into smaller segments, generating fabricated answers, and obfuscating the model's memory of the original input. A benchmark dataset was crafted with questions containing privacy-sensitive information from diverse fields. P2F achieves zero-shot generalization, allowing adaptability across a wide range of use cases without manual adjustments. Experimental results indicate P2F's robust capability to obfuscate LLM's memory, attaining a forgetfulness score of around 90\% without any utility loss. This represents an enhancement of up to 63\% when contrasted with the naive direct instruction technique, highlighting P2F's efficacy in mitigating memory retention of sensitive information within LLMs. Our findings establish the first benchmark in the novel field of the LLM forgetting task, representing a meaningful advancement in privacy preservation in the emerging LLM domain.

翻译：大型语言模型（LLM）已展现出强大的能力，但隐私泄露风险仍是一个重大关切。传统的隐私保护方法，如差分隐私和同态加密，无法适用于仅提供黑盒API的场景，因为这些方法要么要求模型透明，要么需要大量计算资源。我们提出Prompt2Forget（P2F），这是首个通过教LLM遗忘来应对其本地隐私挑战的框架。该方法将完整问题分解为更小的片段，生成虚构答案，并混淆模型对原始输入的存储。我们构建了一个包含来自不同领域的隐私敏感问题的基准数据集。P2F实现了零样本泛化能力，无需人工调整即可适应广泛的用例。实验结果表明，P2F在混淆LLM记忆方面具有稳健能力，能在不损失任何效用的前提下实现约90%的遗忘得分。与简单的直接指令方法相比，这一效果提升了高达63%，凸显了P2F在减轻LLM对敏感信息记忆保持方面的有效性。我们的研究成果为LLM遗忘任务这一新兴领域建立了首个基准，标志着在LLM新兴领域隐私保护方面取得了有意义的进展。

相关内容

大语言模型

关注 66

大语言模型是基于海量文本数据训练的深度学习模型。它不仅能够生成自然语言文本，还能够深入理解文本含义，处理各种自然语言任务，如文本摘要、问答、翻译等。2023年，大语言模型及其在人工智能领域的应用已成为全球科技研究的热点，其在规模上的增长尤为引人注目，参数量已从最初的十几亿跃升到如今的一万亿。参数量的提升使得模型能够更加精细地捕捉人类语言微妙之处，更加深入地理解人类语言的复杂性。在过去的一年里，大语言模型在吸纳新知识、分解复杂任务以及图文对齐等多方面都有显著提升。随着技术的不断成熟，它将不断拓展其应用范围，为人类提供更加智能化和个性化的服务，进一步改善人们的生活和生产方式。

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日