Large language model (LLM) agents have begun to delegate work to one another. Protocols such as the Model Context Protocol (MCP) and the Agent2Agent protocol (A2A) let an agent publish what it can do and let others call it, and public registries of such agents are already appearing. These protocols assume that an advertised capability is a static, truthful fact. A real agent's capability is neither: its competence is probabilistic, varies with input, and drifts when the underlying model is updated, and, because the agent is itself a language model, it can describe itself with complete confidence and be wrong. A caller therefore sees what an agent claims to do, not what it can do, and has no principled way to tell a reliable provider from a fluent impostor. We argue that these difficulties share one cause: the market for lemons. When quality is hidden and claims are cheap, good and bad providers become indistinguishable, honest reliability goes unrewarded, and the market decays toward its worst participants. Economics offers three remedies (signaling, screening, and reputation), and none of them is present in today's agent protocols. We make four contributions: (1) a failure taxonomy that names confident-wrong as a non-adversarial, correlated subclass of Byzantine faults that classical fault tolerance mismodels; (2) a market-for-lemons model showing that faith-based protocols admit only a low-trust equilibrium; (3) the Trust Layer, a thin, protocol-agnostic narrow waist above MCP and A2A that adds probabilistic capability descriptors, screening, and reputation, and that admits a separating equilibrium when the cost of sustaining an overclaim exceeds the gain from it; and (4) a reliability-composition bound for delegation chains, together with an end-to-end placement argument. The design needs no model retraining and degrades gracefully when its trust anchors are absent or corrupt.