Large Language Models (LLMs) are evolving into autonomous agents capable of executing complex workflows via standardized protocols (e.g., MCP). However, this paradigm shifts control from deterministic code to probabilistic inference, creating a fundamental Trust-Authorization Mismatch: static permissions are structurally decoupled from the agent's fluctuating runtime trustworthiness. In this Systematization of Knowledge (SoK), we survey more than 200 representative papers to categorize the emerging landscape of agent security. We propose the Belief-Intention-Permission (B-I-P) framework as a unifying formal lens. By decomposing agent execution into three distinct stages (Belief Formation, Intent Generation, and Permission Grant), we demonstrate that diverse threats, from prompt injection to tool poisoning, share a common root cause: the desynchronization between dynamic trust states and static authorization boundaries. Using the B-I-P lens, we systematically map existing attacks and defenses and identify critical gaps where current mechanisms fail to bridge this mismatch. Finally, we outline a research agenda for shifting from static Role-Based Access Control (RBAC) to dynamic, risk-adaptive authorization.