Agent Primitives: Reusable Latent Building Blocks for Multi-Agent Systems

Existing multi-agent systems (MAS) can handle complex problems through collaboration among multiple agents, but they are often highly task-specific: they rely on manually crafted agent roles and interaction prompts, which increases architectural complexity and limits reusability across tasks. Moreover, most MAS communicate primarily through natural language, which makes them vulnerable to error accumulation and instability across the long, multi-stage interaction histories that agents maintain internally. In this work, we propose \textbf{Agent Primitives}, a set of reusable latent building blocks for LLM-based MAS. Inspired by neural network design, where complex models are built from reusable components, we observe that many existing MAS architectures can be decomposed into a small number of recurring internal computation patterns. Based on this observation, we instantiate three primitives: Review, Voting and Selection, and Planning and Execution. All primitives communicate internally via the key-value (KV) cache, which improves both robustness and efficiency by mitigating information degradation across multi-stage interactions. To enable automatic system construction, an Organizer agent selects and composes primitives for each query, guided by a lightweight knowledge pool of previously successful configurations, forming a primitive-based MAS. Experiments show that primitive-based MAS improve average accuracy by 12.0--16.5\% over single-agent baselines and reduce token usage and inference latency by approximately 3$\times$--4$\times$ compared to text-based MAS, while incurring only 1.3$\times$--1.6$\times$ overhead relative to single-agent inference and providing more stable performance across model backbones.
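The composition step described above (an Organizer picking primitives for a query with help from a knowledge pool) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `State`, `KnowledgePool`, `organize`, and `run`, the task tags, and the fallback rule are all hypothetical, and the three primitives are stubs that only record that they ran. In particular, the sketch passes a plain Python object between primitives, whereas the abstract states that the actual primitives communicate through the KV cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    """Stand-in for the shared context passed between primitives (assumption:
    the real system passes a KV cache, not a Python object)."""
    query: str
    trace: List[str] = field(default_factory=list)


Primitive = Callable[[State], State]


def _stub(name: str) -> Primitive:
    """Build a placeholder primitive that only logs its own name."""
    def apply(state: State) -> State:
        return State(query=state.query, trace=state.trace + [name])
    return apply


# The three primitive names come from the abstract; their bodies are stubs.
PRIMITIVES: Dict[str, Primitive] = {
    "review": _stub("review"),
    "voting_selection": _stub("voting_selection"),
    "planning_execution": _stub("planning_execution"),
}


@dataclass
class KnowledgePool:
    """Hypothetical store mapping a task tag to a primitive sequence that
    worked before."""
    entries: Dict[str, List[str]] = field(default_factory=dict)

    def lookup(self, task_tag: str) -> List[str]:
        # Assumed fallback: a single Review pass for unseen task types.
        return self.entries.get(task_tag, ["review"])


def organize(task_tag: str, pool: KnowledgePool) -> List[Primitive]:
    """Organizer step: choose and order primitives for this query."""
    return [PRIMITIVES[name] for name in pool.lookup(task_tag)]


def run(query: str, task_tag: str, pool: KnowledgePool) -> State:
    """Apply the chosen primitives in sequence to the shared state."""
    state = State(query=query)
    for primitive in organize(task_tag, pool):
        state = primitive(state)
    return state


pool = KnowledgePool({"math": ["planning_execution", "review"]})
result = run("Solve 12 * 7 step by step", "math", pool)
print(result.trace)
```

In this sketch the Organizer is a dictionary lookup; the abstract describes it as an agent, so a closer approximation would replace `organize` with a model call that takes the retrieved configurations as guidance.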
