Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.
翻译:大型语言模型(LLM)以其在语言理解和生成方面的卓越能力而闻名,围绕它们催生了一个充满活力的应用生态系统。然而,它们被广泛集成到各种服务中,也带来了重大的安全风险。本研究剖析了针对实际集成LLM的应用程序进行提示注入攻击的复杂性及其影响。首先,我们对十个商业应用进行了探索性分析,揭示了当前攻击策略在实际应用中的局限性。受这些限制的启发,我们随后提出了HouYi——一种新颖的黑盒提示注入攻击技术,该技术借鉴了传统Web注入攻击的思路。HouYi被划分为三个关键要素:一个无缝集成的预先构建的提示、一个诱导上下文划分的注入提示,以及一个旨在实现攻击目标的恶意载荷。利用HouYi,我们揭示了先前未知且严重的攻击后果,例如不受限制地任意使用LLM以及轻松窃取应用提示。我们将HouYi部署在36个实际集成LLM的应用上,并发现其中31个应用易受提示注入攻击。已有10家供应商验证了我们的发现,其中包括Notion,该漏洞可能影响数百万用户。我们的研究不仅揭示了提示注入攻击的潜在风险,也指明了可能的缓解策略。