The advancement of large language models (LLMs) has significantly enhanced the ability to tackle a wide range of downstream NLP tasks and to unify these tasks into generative pipelines. On the one hand, powerful language models trained on massive textual data have brought unparalleled accessibility and usability to both models and users. On the other hand, unrestricted access to these models can also introduce privacy risks, whether malicious or unintentional. Despite ongoing efforts to address the safety and privacy concerns associated with LLMs, the problem remains unresolved. In this paper, we provide a comprehensive analysis of current privacy attacks targeting LLMs and categorize them according to the adversary's assumed capabilities, shedding light on the potential vulnerabilities of LLMs. We then present a detailed overview of the prominent defense strategies developed to counter these privacy attacks. Beyond existing work, we identify privacy concerns that are likely to emerge as LLMs evolve. Lastly, we point out several potential avenues for future exploration.