Demystifying RCE Vulnerabilities in LLM-Integrated Apps

In recent years, Large Language Models (LLMs) have demonstrated remarkable potential across various downstream tasks. LLM-integrated frameworks, which serve as the essential infrastructure, have given rise to many LLM-integrated web apps. However, some of these frameworks suffer from Remote Code Execution (RCE) vulnerabilities, which allow attackers to execute arbitrary code on the apps' servers remotely via prompt injection. Despite the severity of these vulnerabilities, no existing work has investigated them systematically. As a result, detecting such vulnerabilities in frameworks and in real-world LLM-integrated apps remains a significant challenge. To fill this gap, we present two novel strategies: 1) a static analysis-based tool called LLMSmith that scans the source code of a framework to detect potential RCE vulnerabilities, and 2) a prompt-based automated testing approach to verify the vulnerability in LLM-integrated web apps. We discovered 13 vulnerabilities in 6 frameworks, including 12 RCE vulnerabilities and 1 arbitrary file read/write vulnerability. Eleven of them were confirmed by the framework developers, resulting in the assignment of 7 CVE IDs. After testing 51 apps, we found vulnerabilities in 17 of them: 16 are vulnerable to RCE and 1 to SQL injection. We responsibly reported all 17 issues to the corresponding developers and received acknowledgments. Furthermore, we amplify the attack impact beyond RCE by showing how attackers can exploit other app users (e.g., app response hijacking, user API key leakage) without direct interaction between the attacker and the victim. Lastly, we propose mitigation strategies to improve the security awareness of both framework and app developers and help them mitigate these risks effectively.
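To make the idea of statically scanning framework source for potential RCE sinks more concrete, here is a minimal sketch in Python. It is not LLMSmith's implementation, which the abstract does not describe in detail. It only flags direct calls to a small, hypothetical list of dangerous functions, and the names `DANGEROUS_CALLS`, `dotted_name`, `find_dangerous_calls`, and `SAMPLE` are assumptions made for illustration.

```python
import ast

# Hypothetical sink list chosen for illustration; not taken from the paper.
DANGEROUS_CALLS = {"eval", "exec", "os.system", "subprocess.run", "subprocess.Popen"}


def dotted_name(node):
    """Return a dotted name such as 'os.system' for Name/Attribute chains, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_dangerous_calls(source):
    """Return sorted (line, name) pairs for calls whose callee is in DANGEROUS_CALLS."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in DANGEROUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


# A toy stand-in for framework code that executes LLM-generated text.
SAMPLE = '''
import os

def run_llm_code(generated):
    return eval(generated)

def shell(cmd):
    os.system(cmd)
'''

if __name__ == "__main__":
    for line, name in find_dangerous_calls(SAMPLE):
        print(f"line {line}: call to {name}")
```

A scan like this only locates candidate sinks. Deciding whether a sink is actually reachable from user-controlled prompts requires further analysis, such as tracing call chains from the framework's public API, and confirming exploitability in a deployed app requires testing of the kind the abstract's second strategy describes.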

