Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction of sub-agents, which hurts adaptability. We address this challenge with a unified, framework-agnostic agent abstraction that models any agent as a four-tuple (Instruction, Context, Tools, Model). This tuple acts as a compositional recipe for capabilities, enabling the system to spawn specialized executors for each task on demand. Building on this abstraction, we introduce AOrchestra, an agentic system in which a central orchestrator concretizes the tuple at each step: it curates task-relevant context, selects tools and models, and delegates execution via on-the-fly automatic agent creation. This design reduces human engineering effort and remains framework-agnostic, with plug-and-play support for diverse agents as task executors. It also enables a controllable performance-cost trade-off, allowing the system to approach Pareto efficiency. Across three challenging benchmarks (GAIA, SWE-Bench, Terminal-Bench), AOrchestra achieves a 16.28% relative improvement over the strongest baseline when paired with Gemini-3-Flash. The code is available at: https://github.com/FoundationAgents/AOrchestra
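The four-tuple abstraction and the orchestrator's per-step concretization can be illustrated with a minimal Python sketch. Everything named here (`AgentSpec`, `spawn`, the keyword-matching heuristics, the `(cost, model)` pool layout, and the toy models) is a hypothetical stand-in chosen for illustration, not the AOrchestra implementation; in particular, picking the most expensive affordable model is only an assumed proxy for the performance-cost trade-off.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]    # hypothetical tool signature: input -> output
Model = Callable[[str], str]   # hypothetical model: prompt -> completion


@dataclass
class AgentSpec:
    """The four-tuple (Instruction, Context, Tools, Model) for one sub-agent."""
    instruction: str
    context: List[str]
    tools: Dict[str, Tool]
    model: Model

    def run(self) -> str:
        # Build a prompt from instruction and context, then call the model once.
        prompt = "\n".join([self.instruction, *self.context])
        return self.model(prompt)


def spawn(subtask: str, history: List[str], tool_pool: Dict[str, Tool],
          model_pool: Dict[str, Tuple[float, Model]], budget: float) -> AgentSpec:
    """Concretize the tuple for one subtask (toy heuristics, assumed)."""
    text = subtask.lower()
    # Select tools: keep those whose name is mentioned in the subtask.
    tools = {name: fn for name, fn in tool_pool.items() if name in text}
    # Curate context: keep history entries that mention a selected tool.
    context = [h for h in history if any(name in h.lower() for name in tools)]
    # Select model: the costliest one that still fits the budget.
    affordable = [entry for entry in model_pool.values() if entry[0] <= budget]
    if not affordable:
        raise ValueError("no model fits the budget")
    _, model = max(affordable, key=lambda entry: entry[0])
    return AgentSpec(instruction=subtask, context=context, tools=tools, model=model)


# Toy models that echo the first prompt line; stand-ins for real LLM calls.
def cheap_model(prompt: str) -> str:
    return "cheap:" + prompt.splitlines()[0]


def strong_model(prompt: str) -> str:
    return "strong:" + prompt.splitlines()[0]


tool_pool = {"search": lambda q: "results for " + q,
             "shell": lambda c: "ran " + c}
model_pool = {"cheap": (1.0, cheap_model), "strong": (10.0, strong_model)}
history = ["shell: cloned repo", "search: found paper title"]

spec = spawn("search the paper abstract", history, tool_pool, model_pool, budget=5.0)
print(spec.run())
```

Raising `budget` above the cost of the stronger model changes only the `Model` slot of the resulting tuple, which is the kind of knob the abstract describes as a controllable performance-cost trade-off.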