Large language model (LLM) services have recently begun offering plugin ecosystems for interacting with third-party API services. This innovation extends the capabilities of LLMs, but it also introduces risks, because plugins developed by various third parties cannot be easily trusted. This paper proposes a new attack framework for examining security and safety vulnerabilities in LLM platforms that incorporate third-party services. Applying the framework to widely used LLMs, we identify real-world attacks on third-party APIs, spanning several domains, that can imperceptibly modify LLM outputs. The paper discusses the unique challenges posed by third-party API integration and offers strategies for improving the security and safety of LLM ecosystems going forward. Our code is released at https://github.com/vk0812/Third-Party-Attacks-on-LLMs.