While recent research increasingly showcases the remarkable capabilities of Large Language Models (LLMs), it is equally crucial to examine their associated risks. Among these, privacy and security vulnerabilities are particularly concerning, posing significant ethical and legal challenges. At the heart of these vulnerabilities lies memorization, which refers to a model's tendency to store and reproduce phrases from its training data. This phenomenon has been shown to be a fundamental source of various privacy and security attacks against LLMs. In this paper, we provide a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability. Next, we discuss the metrics and methods used to quantify memorization, followed by an analysis of the causes and factors that contribute to the memorization phenomenon. We then explore the strategies proposed so far to mitigate its undesirable aspects. We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance, and the analysis of memorization in specific LLM contexts such as conversational agents, retrieval-augmented generation, and diffusion language models. Given the rapid pace of research in this field, we also maintain a dedicated repository of the references discussed in this survey, which will be regularly updated to reflect the latest developments.