Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems

Agentic AI systems, which leverage multiple autonomous agents and large language models (LLMs), are increasingly used to address complex, multi-step tasks. The safety, security, and functionality of these systems are critical, especially in high-stakes applications. However, the current ecosystem of inter-agent communication is fragmented, with protocols such as the Model Context Protocol (MCP) for tool access and the Agent-to-Agent (A2A) protocol for coordination being analyzed in isolation. This fragmentation creates a semantic gap that prevents the rigorous analysis of system properties and introduces risks such as architectural misalignment and exploitable coordination issues. To address these challenges, we introduce a modeling framework for agentic AI systems composed of two central models: (1) the host agent model formalizes the top-level entity that interacts with the user, decomposes tasks, and orchestrates their execution by leveraging external agents and tools; (2) the task lifecycle model details the states and transitions of individual sub-tasks from creation to completion, providing a fine-grained view of task management and error handling. Together, these models provide a unified semantic framework for reasoning about the behavior of multi-AI agent systems. Grounded in this framework, we define 16 properties for the host agent and 14 for the task lifecycle, categorized into liveness, safety, completeness, and fairness. Expressed in temporal logic, these properties enable formal verification of system behavior, detection of coordination edge cases, and prevention of deadlocks and security vulnerabilities. Through this effort, we introduce the first rigorously grounded, domain-agnostic framework for the analysis, design, and deployment of correct, reliable, and robust agentic AI systems.

翻译：智能体AI系统利用多个自主智能体和大语言模型（LLM）来应对复杂多步任务。此类系统的安全性、安保性与功能性至关重要，尤其是在高风险应用中。然而，当前智能体间通信生态较为碎片化，例如用于工具访问的模型上下文协议（MCP）和用于协调的智能体间（A2A）协议均孤立地进行分析。这种碎片化造成了语义鸿沟，阻碍了对系统属性的严格分析，并引入了架构失调、协调漏洞可被利用等风险。为解决这些问题，我们提出一个由两个核心模型组成的智能体AI系统建模框架：（1）宿主智能体模型形式化了与用户交互、分解任务并借助外部智能体与工具协调执行的高层实体；（2）任务生命周期模型详细描述了单个子任务从创建到完成的各状态与转换过程，提供了任务管理与错误处理的细粒度视图。这两个模型共同构成了推理多智能体AI系统行为的统一语义框架。在此框架基础上，我们为宿主智能体定义了16项属性，为任务生命周期定义了14项属性，分别归类为活性、安全性、完整性和公平性。这些属性以时序逻辑表达，能够实现系统行为的形式化验证、协调边界情况的检测以及死锁与安全漏洞的预防。通过此项工作，我们首次提出了一个严格奠基且领域无关的框架，用于分析、设计和部署正确、可靠且鲁棒的智能体AI系统。