Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized natural language understanding and generation. They possess deep language comprehension, human-like text generation capabilities, contextual awareness, and robust problem-solving skills, making them invaluable in various domains (e.g., search engines, customer support, translation). Meanwhile, LLMs have also gained traction in the security community, both revealing security vulnerabilities and showcasing their potential in security-related tasks. This paper explores the intersection of LLMs with security and privacy. Specifically, we investigate how LLMs positively impact security and privacy, the potential risks and threats associated with their use, and the inherent vulnerabilities within LLMs. Through a comprehensive literature review, we categorize the surveyed papers into "The Good" (beneficial LLM applications), "The Bad" (offensive applications), and "The Ugly" (vulnerabilities of LLMs and their defenses). Our review yields several findings. For example, LLMs have proven to enhance code security (code vulnerability detection) and data privacy (data confidentiality protection), outperforming traditional methods. However, they can also be harnessed for various attacks (particularly user-level attacks) due to their human-like reasoning abilities. We also identify areas that require further research. For example, research on model and parameter extraction attacks is limited and often theoretical, hindered by the scale and confidentiality of LLM parameters. Safe instruction tuning, a recent development, requires more exploration. We hope that our work can shed light on LLMs' potential to both bolster and jeopardize cybersecurity.