Large Language Models (LLMs) have emerged as dominant tools for various tasks, particularly when tailored to a specific target by prompt tuning. Nevertheless, data privacy concerns present obstacles, because the tuned prompts depend on sensitive private information. A practical solution is to host a local LLM and optimize a soft prompt privately on the client's own data. Yet hosting a local model becomes problematic when model ownership is protected. Alternative methods, such as sending data to the model's provider for training, intensify these privacy issues when the provider is untrusted. In this paper, we present a novel solution called Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge. Our approach tunes a discrete prompt on the client side and then applies it to the desired cloud models. We demonstrate that prompts suggested by LLMs themselves can be transferred without significantly compromising performance. To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, based on a differentially-private (DP) ensemble of in-context learning with private demonstrations. With DP-OPT, privacy-preserving prompts generated by Vicuna-7b achieve performance competitive with non-private in-context learning on GPT3.5 or local private prompt tuning. Code is available at https://github.com/VITA-Group/DP-OPT .
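As a rough illustration of the DP ensemble idea mentioned above, the following Python sketch aggregates proposals from ensemble members, each assumed to have seen a disjoint slice of the private demonstrations, and selects one candidate with the exponential mechanism. It is not the DP-OPT implementation: the function name `private_vote`, the fixed public `domain`, and the choice of the exponential mechanism are assumptions made for illustration only, and the sketch covers a single selection rather than the privacy accounting needed across a whole generated prompt.

```python
import math
import random
from collections import Counter


def private_vote(votes, domain, epsilon, rng=None):
    """Pick one candidate from `domain` with the exponential mechanism.

    votes   : one proposal per ensemble member; each member is assumed to
              have seen a disjoint slice of the private demonstrations.
    domain  : fixed, data-independent candidate set (hypothetical; for
              example class labels or a small public vocabulary).
    epsilon : privacy budget spent on this single selection.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    candidates = sorted(set(domain))
    if not candidates:
        raise ValueError("domain must not be empty")
    rng = rng or random.Random()

    # Utility of a candidate is its vote count. Changing one private record
    # changes at most one member's vote, so every count moves by at most 1
    # (sensitivity 1). Votes outside the public domain are ignored.
    counts = Counter(v for v in votes if v in candidates)
    scores = [epsilon * counts[c] / 2.0 for c in candidates]

    # Subtract the maximum score before exponentiating to avoid overflow;
    # this rescales all weights equally and leaves the distribution unchanged.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Usage: ten ensemble members propose a label; a large epsilon makes the
# majority proposal overwhelmingly likely, a small epsilon adds more noise.
votes = ["positive"] * 9 + ["negative"]
choice = private_vote(votes, ["positive", "negative", "neutral"],
                      epsilon=50.0, rng=random.Random(0))
```

The candidate set is kept independent of the private data on purpose: restricting the choice to candidates that actually received votes would make the output support depend on the data and break the guarantee. Handling a full LLM vocabulary privately requires more care than this sketch shows.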