Large language models (LLMs), such as ChatGPT, have simplified text generation tasks, yet their inherent privacy risks are drawing increasing attention. While differential privacy techniques have been successfully applied to text classification tasks, the resulting semantic bias makes them unsuitable for text generation. Homomorphic encryption inference methods have also been introduced, but their significant computational and communication costs limit their viability. Furthermore, closed-source, black-box models such as GPT-4 withhold their architecture, which rules out certain privacy-enhancing strategies, such as splitting inference into local and remote parts and adding noise to the data exchanged between them. To overcome these challenges, we introduce PrivInfer, the first privacy-preserving inference framework for black-box LLMs in text generation. Inspired by human writing, PrivInfer employs differential privacy methods to generate perturbed prompts for remote LLM inference and then extracts the meaningful response from the remote model's perturbed output. We also introduce RANTEXT, a differential privacy scheme designed specifically for LLMs that leverages random adjacency in text perturbations. Experimental results indicate that PrivInfer is comparable to GPT-4 in text generation quality while protecting privacy, and that RANTEXT provides stronger privacy protection than baseline methods against three types of differential privacy attacks, including our newly introduced GPT inference attack.
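To make the perturb-then-extract flow concrete, the following is a minimal sketch and not the PrivInfer or RANTEXT implementation. It assumes a generic exponential mechanism over a tiny hand-made vocabulary with toy 2-D embeddings, and it does not model RANTEXT's random adjacency. All names here (`TOY_EMBEDDINGS`, `exponential_mechanism_probs`, `perturb_prompt`, `private_inference`) and the `remote_llm` and `local_extract` callables are hypothetical placeholders introduced for illustration.

```python
import math
import random

# Hypothetical toy vocabulary with made-up 2-D embeddings (illustration only).
TOY_EMBEDDINGS = {
    "doctor": (0.9, 0.1),
    "nurse": (0.8, 0.2),
    "teacher": (0.2, 0.8),
    "student": (0.1, 0.9),
}


def exponential_mechanism_probs(token, epsilon, embeddings=TOY_EMBEDDINGS):
    """Return {candidate: probability} for replacing `token`.

    Utility is the negative embedding distance, divided by the diameter of
    the embedding set so it lies in [-1, 0]; the sensitivity is then taken
    to be 1 (an assumption of this sketch).
    """
    vectors = list(embeddings.values())
    diameter = max(math.dist(a, b) for a in vectors for b in vectors) or 1.0
    origin = embeddings[token]
    weights = {
        word: math.exp(-epsilon * math.dist(origin, vec) / (2.0 * diameter))
        for word, vec in embeddings.items()
    }
    total = sum(weights.values())
    return {word: w / total for word, w in weights.items()}


def perturb_prompt(tokens, epsilon, rng, embeddings=TOY_EMBEDDINGS):
    """Replace each in-vocabulary token with a sample from the mechanism.

    Out-of-vocabulary tokens pass through unchanged, which is a
    simplification of this sketch and offers them no protection.
    """
    perturbed = []
    for token in tokens:
        if token not in embeddings:
            perturbed.append(token)
            continue
        probs = exponential_mechanism_probs(token, epsilon, embeddings)
        words = list(probs)
        perturbed.append(rng.choices(words, weights=[probs[w] for w in words])[0])
    return perturbed


def private_inference(tokens, epsilon, remote_llm, local_extract, rng):
    """Perturb locally, query the remote model, then extract locally.

    `remote_llm` only ever sees the perturbed prompt; `local_extract`
    is assumed to run on the user's side with access to the original.
    """
    perturbed = perturb_prompt(tokens, epsilon, rng)
    remote_output = remote_llm(perturbed)
    return local_extract(tokens, perturbed, remote_output)
```

In this sketch a smaller `epsilon` flattens the replacement distribution (at `epsilon = 0` every candidate is equally likely), while a larger `epsilon` keeps the perturbed prompt closer to the original at the cost of weaker privacy.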