AI agents can perform complex operations at great speed, but just like all the humans we have ever hired, their intelligence remains fallible. Miscommunications go unnoticed, systemic biases have no countermeasure, and inner monologues are rarely written down. We did not come to fire them for their mistakes, but to hire them and provide a safe, productive working environment. We posit that we can reuse a common corporate organizational structure: teams of independent AI agents with strict role boundaries can work toward common goals under opposing incentives. Multiple models serving as a team of rivals can catch and minimize errors in the final product at a small cost in velocity. In this paper we demonstrate that reliability comes not from acquiring perfect components but from careful orchestration of imperfect ones. We describe the architecture of such a system in practice: specialized agent teams (planners, executors, critics, experts), organized around clear goals and coordinated through a remote code executor that keeps data transformations and tool invocations separate from the reasoning models. Rather than agents directly calling tools and ingesting full responses, they write code that executes remotely; only relevant summaries return to agent context. By preventing raw data and tool outputs from contaminating context windows, the system maintains a clean separation between perception (brains that plan and reason) and execution (hands that perform heavy data transformations and API calls). We demonstrate that the approach intercepts over 90% of internal errors before user exposure while maintaining acceptable latency. A survey of our traces shows that we trade only cost and latency for correctness, and that capabilities can be expanded incrementally without impacting existing ones.
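The remote-executor pattern sketched in the abstract can be illustrated in a few lines. This is a minimal sketch under stated assumptions, not the system's actual API: the names `run_remote` and `summarize`, the `tools` dictionary, and the `result` convention are all hypothetical illustrations of the idea that agent-written code runs away from the model and only a context-sized digest flows back.

```python
def summarize(result, limit: int = 200) -> str:
    """Collapse an arbitrarily large result into a context-sized digest.

    Hypothetical helper: the raw result never reaches the agent's
    context window, only this short string does.
    """
    text = repr(result)
    if len(text) <= limit:
        return text
    return f"{text[:limit]}... ({len(text)} chars total)"


def run_remote(agent_code: str, tools: dict) -> str:
    """Execute agent-written code in a separate namespace ("the hands").

    Heavy data transformations and tool invocations happen inside this
    sandbox dict; the reasoning model ("the brain") receives only the
    summary returned at the end.
    """
    sandbox = {"tools": tools, "result": None}
    exec(agent_code, sandbox)  # tool calls and transforms stay here
    return summarize(sandbox["result"])


# Usage: the agent emits code that calls tools and binds a value to
# `result`; a 1000-element raw response stays remote, and only the
# truncated digest enters the agent's context.
agent_code = "result = [tools['fetch'](i) for i in range(1000)]"
digest = run_remote(agent_code, {"fetch": lambda i: {"row": i}})
```

The design choice this sketches is the one the abstract argues for: by construction, the perceiving model cannot be contaminated by raw tool output, because the full payload exists only inside the executor's namespace.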