Large language models (LLMs) such as ChatGPT have simplified text generation tasks, but their inherent privacy risks are drawing increasing attention. Differential privacy techniques have been applied successfully to text classification, but the semantic bias they introduce makes them unsuitable for text generation. Homomorphic encryption inference methods have also been proposed, but their significant computational and communication costs limit their viability. Furthermore, closed-source, black-box models such as GPT-4 withhold their architecture, which rules out certain privacy-enhancing strategies, such as splitting inference between a local and a remote component and adding noise to the data exchanged between them. To overcome these challenges, we introduce PrivInfer, the first practical privacy-preserving inference framework for black-box LLMs in text generation. PrivInfer uses differential privacy methods to generate perturbed prompts for remote LLM inference and then extracts the meaningful response from the remote model's output on those perturbed prompts. We also introduce RANTEXT, a differential privacy mechanism for the perturbation module of PrivInfer, designed specifically for LLMs, that leverages random adjacency in text perturbation. Experimental results indicate that PrivInfer is comparable to GPT-4 in text generation quality while protecting privacy, and that RANTEXT provides stronger privacy protection than baseline methods against three types of differential privacy attacks, including our newly introduced GPT inference attack.
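The perturb, query, and extract flow described in the abstract can be illustrated with a minimal sketch. This is not RANTEXT or the PrivInfer implementation: it swaps in the generic exponential mechanism over a hypothetical four-word vocabulary with made-up 2-D embeddings, and the names `TOY_EMBEDDINGS`, `perturb_prompt`, `private_inference`, `remote_llm`, and `local_extract` are assumptions introduced only for illustration.

```python
import math
import random

# Hypothetical toy embeddings (assumption). A real system would use the
# token embeddings of an actual language model.
TOY_EMBEDDINGS = {
    "doctor": (1.0, 0.0),
    "nurse": (0.9, 0.1),
    "teacher": (0.0, 1.0),
    "lawyer": (0.2, 0.9),
}


def exponential_mechanism_probs(token, epsilon, embeddings=TOY_EMBEDDINGS):
    """Replacement probabilities for `token` under the exponential mechanism.

    Utility is the negative Euclidean distance to the original token.
    The sensitivity is bounded by the largest pairwise distance in the
    vocabulary, which follows from the triangle inequality.
    """
    target = embeddings[token]
    vectors = list(embeddings.values())
    sensitivity = max(math.dist(a, b) for a in vectors for b in vectors)
    weights = {
        word: math.exp(-epsilon * math.dist(target, vec) / (2 * sensitivity))
        for word, vec in embeddings.items()
    }
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}


def perturb_prompt(prompt, epsilon, rng):
    """Replace each in-vocabulary token with a sampled neighbour.

    Tokens outside the toy vocabulary pass through unchanged, so this
    sketch gives them no protection at all.
    """
    out = []
    for tok in prompt.split():
        if tok in TOY_EMBEDDINGS:
            probs = exponential_mechanism_probs(tok, epsilon)
            words = list(probs)
            weights = [probs[w] for w in words]
            out.append(rng.choices(words, weights=weights, k=1)[0])
        else:
            out.append(tok)
    return " ".join(out)


def private_inference(prompt, epsilon, remote_llm, local_extract, rng):
    """Perturb locally, query the remote model, then extract locally.

    `remote_llm` and `local_extract` are caller-supplied callables
    (assumptions); only the perturbed prompt is passed to `remote_llm`.
    """
    perturbed = perturb_prompt(prompt, epsilon, rng)
    remote_output = remote_llm(perturbed)
    return local_extract(prompt, perturbed, remote_output)
```

A smaller `epsilon` spreads probability more evenly across the vocabulary (more privacy, more semantic drift), while a larger `epsilon` concentrates it on the original token. The extraction step is left as a callable because the abstract does not specify how the meaningful response is recovered.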