A Layered Protocol Architecture for the Internet of Agents

Large Language Models (LLMs) have improved markedly and can now learn domain-specific languages (DSLs), including APIs and tool interfaces. This capability has enabled AI agents that perform preliminary computations and act through tool calling, which is now being standardized by protocols such as the Model Context Protocol (MCP). However, LLMs face a fundamental limitation: their context windows cannot grow indefinitely, which restricts their memory and computational capacity. Agent collaboration therefore becomes essential for solving increasingly complex problems, much as computational systems rely on different types of memory to scale. The "Internet of Agents" (IoA) is the communication stack that lets agents scale by distributing computation across collaborating entities. Current network architectural stacks (OSI and TCP/IP) were designed for data delivery between hosts and processes, not for agent collaboration with semantic understanding. To address this gap, we propose two new layers: an Agent Communication Layer (L8) and an Agent Semantic Layer (L9). L8 formalizes the structure of communication, standardizing message envelopes, speech-act performatives (e.g., REQUEST, INFORM), and interaction patterns (e.g., request-reply, publish-subscribe), building on protocols like MCP. L9 (1) formalizes semantic context discovery and negotiation, (2) provides semantic grounding by binding terms to a semantic context, and (3) semantically validates incoming prompts and disambiguates them as needed. Furthermore, L9 introduces primitives for coordination and consensus, allowing agents to align on shared states, collective goals, and distributed beliefs. Together, these layers provide the foundation for scalable, distributed agent collaboration, enabling the next generation of multi-agent systems.
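To make the division of labor between the two proposed layers concrete, the following is a minimal Python sketch, not the authors' specification. The abstract names only the envelope concept, the REQUEST and INFORM performatives, and the request-reply pattern; every field name (`sender`, `context_id`, `conversation_id`), the helper functions `reply` and `unbound_terms`, and the `weather/v1` context are hypothetical and chosen purely for illustration. The L9 check here is deliberately simplistic: it treats a semantic context as a set of permitted terms and flags body keys that are not bound in it.

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class Performative(Enum):
    # The two speech acts named in the abstract; a full L8 would define more.
    REQUEST = "REQUEST"
    INFORM = "INFORM"


@dataclass
class Envelope:
    """Hypothetical L8 message envelope; all field names are assumptions."""
    sender: str
    receiver: str
    performative: Performative
    context_id: str  # names the semantic context the body should be read in (L9)
    body: dict
    conversation_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def reply(request: Envelope, body: dict) -> Envelope:
    """Request-reply pattern: answer with INFORM inside the same conversation."""
    return Envelope(
        sender=request.receiver,
        receiver=request.sender,
        performative=Performative.INFORM,
        context_id=request.context_id,
        body=body,
        conversation_id=request.conversation_id,
    )


def unbound_terms(msg: Envelope, contexts: dict[str, set[str]]) -> set[str]:
    """Toy L9 validation: return body keys the named context does not define."""
    vocabulary = contexts.get(msg.context_id, set())
    return set(msg.body) - vocabulary


# Illustrative shared context registry (assumed, not from the paper).
contexts = {"weather/v1": {"city", "temperature_c"}}

req = Envelope("agent-a", "agent-b", Performative.REQUEST,
               "weather/v1", {"city": "Oslo", "units": "metric"})
missing = unbound_terms(req, contexts)  # "units" is not bound in weather/v1
resp = reply(req, {"city": "Oslo", "temperature_c": 4})
```

In this sketch the envelope and the reply helper stand in for L8 concerns (structure, performative, conversation tracking), while `unbound_terms` stands in for the L9 step of validating a message against a negotiated context before acting on it. A term reported as unbound is where a real system would trigger disambiguation or context negotiation.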
