The advancement of large language models (LLMs) has significantly enhanced the ability to tackle a wide range of downstream NLP tasks and to unify these tasks into generative pipelines. On the one hand, powerful language models, trained on massive textual corpora, have become unprecedentedly accessible and usable for both developers and end users. On the other hand, unrestricted access to these models can also introduce privacy risks, both malicious and unintentional. Despite ongoing efforts to address the safety and privacy concerns associated with LLMs, these problems remain unresolved. In this paper, we provide a comprehensive analysis of current privacy attacks targeting LLMs and categorize them according to the adversary's assumed capabilities, shedding light on potential vulnerabilities in LLMs. We then present a detailed overview of prominent defense strategies developed to counter these attacks. Beyond existing work, we identify emerging privacy concerns as LLMs continue to evolve. Finally, we point out several promising avenues for future research.