Artificial intelligence (AI) agents are increasingly used across a variety of domains to automate tasks, interact with users, and make decisions based on data inputs. Ensuring that AI agents perform only authorized actions and handle inputs appropriately is essential for maintaining system integrity and preventing misuse. In this study, we introduce AgentGuardian, a novel security framework that governs and protects AI agent operations by enforcing context-aware access-control policies. During a controlled staging phase, the framework monitors execution traces to learn legitimate agent behaviors and input patterns. From these traces, it derives adaptive policies that regulate tool calls made by the agent, guided by both real-time input context and the control-flow dependencies of multi-step agent actions. Evaluation across two real-world AI agent applications demonstrates that AgentGuardian effectively detects malicious or misleading inputs while preserving normal agent functionality. Moreover, its control-flow-based governance mechanism mitigates hallucination-driven errors and other orchestration-level malfunctions.
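The control-flow-based governance described above could be sketched, in a much-simplified form, as a transition whitelist over tool calls learned from staging-phase execution traces. All names and the whitelist design below are illustrative assumptions for exposition, not AgentGuardian's actual implementation:

```python
# Minimal sketch: during staging, record observed tool-call transitions;
# at runtime, permit a tool call only if the (previous, next) transition
# was seen in a legitimate trace. Names here are hypothetical.

class ToolCallPolicy:
    def __init__(self):
        self.allowed_transitions = set()

    def learn(self, trace):
        """Record tool-call transitions from one staging-phase trace."""
        for prev_tool, next_tool in zip(trace, trace[1:]):
            self.allowed_transitions.add((prev_tool, next_tool))

    def is_allowed(self, prev_tool, next_tool):
        """At runtime, allow only transitions observed during staging."""
        return (prev_tool, next_tool) in self.allowed_transitions


policy = ToolCallPolicy()
# A legitimate multi-step workflow observed in staging (illustrative):
policy.learn(["search_docs", "summarize", "send_reply"])

print(policy.is_allowed("search_docs", "summarize"))   # True: observed
print(policy.is_allowed("search_docs", "send_reply"))  # False: skips a step
```

A real system would additionally condition decisions on input context (as the abstract states), but even this transition check blocks a hallucinated tool call that jumps outside any learned workflow.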