Given a user's complex information need, a multi-agent Deep Research system iteratively plans, retrieves, and synthesizes evidence across hundreds of documents to produce a high-quality answer. In one possible architecture, an orchestrator agent coordinates the process, while parallel worker agents execute tasks. Current Deep Research systems, however, often rely on hand-engineered prompts and static architectures, making improvement brittle, expensive, and time-consuming. We therefore explore several multi-agent optimization methods and show that enabling agents to self-play and explore different prompt combinations can produce high-quality Deep Research systems that match or outperform expert-crafted prompts.
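To make the orchestrator-worker architecture and the idea of exploring prompt combinations concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes an exhaustive search over a tiny prompt grid rather than self-play, and every name in it (`ORCHESTRATOR_PROMPTS`, `WORKER_PROMPTS`, `worker`, `orchestrate`, `score`, `best_prompt_combination`) is hypothetical. The worker and the scorer are stand-ins: a real system would call a language model and a retriever, and would score answers against a task-specific quality metric.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical candidate prompts, invented for illustration only.
ORCHESTRATOR_PROMPTS = [
    "Split the question into sub-questions.",
    "Plan step by step, then delegate.",
]
WORKER_PROMPTS = [
    "Answer briefly.",
    "Answer and cite evidence.",
]


def worker(prompt: str, task: str) -> str:
    """Stand-in for a worker agent; a real one would call an LLM and a retriever."""
    return f"[{prompt}] {task}"


def orchestrate(orch_prompt: str, worker_prompt: str, question: str,
                n_workers: int = 3) -> str:
    """Split the question into sub-tasks, run workers in parallel, merge findings."""
    tasks = [f"{question} (part {i})" for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the submitted tasks.
        findings = list(pool.map(lambda t: worker(worker_prompt, t), tasks))
    return f"{orch_prompt} :: " + " | ".join(findings)


def score(answer: str) -> int:
    """Toy scorer (assumption): counts the words 'evidence' and 'step'."""
    return answer.count("evidence") + answer.count("step")


def best_prompt_combination(question: str) -> tuple[int, str, str]:
    """Evaluate every (orchestrator, worker) prompt pair and keep the best one."""
    scored = [
        (score(orchestrate(o, w, question)), o, w)
        for o, w in product(ORCHESTRATOR_PROMPTS, WORKER_PROMPTS)
    ]
    return max(scored, key=lambda item: item[0])


if __name__ == "__main__":
    print(best_prompt_combination("What limits battery life?"))
```

Exhaustive search is only feasible because the grid here has four combinations. With realistic prompt spaces the search would have to be guided, which is where optimization methods such as those explored here come in.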