Large pretrained language models (LLMs) have shown a surprising in-context learning (ICL) ability. An important application in deploying LLMs is to augment them with a private database for a specific task. The main obstacle to this promising commercial use is that LLMs have been shown to memorize their training data, and their prompt data are vulnerable to membership inference attacks (MIA) and prompt leaking attacks. To address this problem, we treat LLMs as untrusted with respect to privacy and propose a locally differentially private framework for in-context learning (LDP-ICL) in settings where labels are sensitive. Drawing on the account of in-context learning in Transformers as gradient descent, we analyze the trade-off between privacy and utility in LDP-ICL for classification. Moreover, we apply LDP-ICL to the discrete distribution estimation problem. Finally, we report several experiments that support our analysis.
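To make the label-privacy setting concrete, the following is a minimal sketch of how sensitive labels might be perturbed locally before they are placed in a prompt, and how a label distribution can be estimated from the perturbed labels. The abstract does not name the perturbation mechanism; k-ary randomized response is assumed here because it is a standard ε-LDP mechanism for categorical data. The function names (`k_rr`, `estimate_distribution`, `build_prompt`) and the prompt layout are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def k_rr(label, k, epsilon, rng):
    """Perturb a label in {0, ..., k-1} with k-ary randomized response.

    Assumed mechanism (not confirmed by the source): keep the true label with
    probability e^eps / (e^eps + k - 1), otherwise report one of the other
    k - 1 labels uniformly at random. This satisfies eps-LDP.
    """
    if k < 2 or epsilon <= 0:
        raise ValueError("need k >= 2 and epsilon > 0")
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return label
    other = rng.randrange(k - 1)  # uniform over the k - 1 remaining labels
    return other if other < label else other + 1


def estimate_distribution(noisy_labels, k, epsilon):
    """Unbiased estimate of the true label frequencies from k-RR outputs.

    Individual entries can fall slightly outside [0, 1] on small samples.
    """
    if k < 2 or epsilon <= 0:
        raise ValueError("need k >= 2 and epsilon > 0")
    n = len(noisy_labels)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q_flip = 1.0 / (math.exp(epsilon) + k - 1)
    counts = Counter(noisy_labels)
    return [(counts[j] / n - q_flip) / (p_keep - q_flip) for j in range(k)]


def build_prompt(texts, noisy_labels, label_names, query):
    """Assemble an ICL prompt from demonstrations with perturbed labels.

    The layout is a hypothetical illustration; only perturbed labels
    ever reach the untrusted model.
    """
    lines = [
        f"Input: {t}\nLabel: {label_names[y]}"
        for t, y in zip(texts, noisy_labels)
    ]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)
```

A smaller ε flips labels more often, which strengthens privacy but makes the demonstrations noisier; this is the privacy-utility trade-off the abstract refers to. The debiasing step in `estimate_distribution` corresponds to the discrete distribution estimation use case, under the same assumed mechanism.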