In recent years, agentic artificial intelligence (AI) systems are becoming increasingly widespread. These systems allow agents to use various tools, such as web browsers, compilers, and more. However, despite their popularity, agentic AI systems also introduce a myriad of security concerns, due to their constant interaction with third-party servers. For example, a malicious adversary can cause data exfiltration by executing prompt injection attacks, as well as other unwarranted behavior. These security concerns have recently motivated researchers to improve the safety and reliability of agentic systems. However, most of the literature on this topic is from the AI standpoint and lacks the system-security perspective and guarantees. In this work, we begin bridging this gap and present an analysis through the lens of classic cybersecurity research. Specifically, motivated by decades of progress in this domain, we identify short- and long-term research problems in agentic AI safety by examining end-to-end security properties of entire systems, rather than standalone AI models running in isolation. Our key goal is to examine where research challenges arise when applying traditional security principles in the context of AI agents and, as a secondary goal, distill these ideas for AI practitioners. Furthermore, we extensively cover 11 case studies of real-world attacks on agentic systems, as well as define a series of new research problems that are specific to this important domain.
翻译:近年来,智能体人工智能(AI)系统正变得越来越普及。这些系统允许智能体使用各种工具,例如网络浏览器、编译器及其他工具。然而,尽管其日益流行,智能体AI系统也因其与第三方服务器的持续交互而引入了诸多安全问题。例如,恶意攻击者可通过执行提示注入攻击导致数据泄露,以及其他未经授权的行为。这些安全问题近来促使研究人员致力于提升智能体系统的安全性与可靠性。然而,该主题的大多数文献均从AI视角出发,缺乏系统安全层面的视角与保障。在本工作中,我们开始弥合这一差距,并通过经典网络安全研究的视角展开分析。具体而言,受该领域数十年进展的启发,我们通过审视整个系统的端到端安全属性(而非孤立运行的独立AI模型),识别出智能体AI安全领域短期与长期的研究问题。我们的核心目标是探究在AI智能体语境下应用传统安全原则时所产生的研究挑战,次要目标则是为AI从业者提炼这些理念。此外,我们广泛涵盖了11个针对智能体系统的真实攻击案例研究,并定义了一系列该重要领域特有的新研究问题。