Deep research has emerged as a transformative capability for autonomous agents, empowering Large Language Models to navigate complex, open-ended tasks. However, realizing its full potential is hindered by critical limitations, including escalating contextual noise in long-horizon tasks, fragility leading to cascading errors, and a lack of modular extensibility. To address these challenges, we introduce Yunque DeepResearch, a hierarchical, modular, and robust framework. The architecture is characterized by three key components: (1) a centralized Multi-Agent Orchestration System that routes subtasks to an Atomic Capability Pool of tools and specialized sub-agents; (2) a Dynamic Context Management mechanism that structures completed sub-goals into semantic summaries to mitigate information overload; and (3) a proactive Supervisor Module that ensures resilience through active anomaly detection and context pruning. Yunque DeepResearch achieves state-of-the-art performance across a range of agentic deep research benchmarks, including GAIA, BrowseComp, BrowseComp-ZH, and Humanity's Last Exam. We open-source the framework, reproducible implementations, and application cases to empower the community.
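To make the three components in the abstract concrete, the following is a minimal sketch of how they might interact. It is not the Yunque DeepResearch implementation. Every name here (`Context`, `Supervisor`, `Orchestrator`, `close_subgoal`, the `search` and `calc` capabilities) is hypothetical. The plan is a fixed list rather than LLM-generated, the "semantic summary" is a simple string template, and "anomaly detection" is reduced to spotting an identical repeated step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# All names below are illustrative assumptions, not the framework's real API.


@dataclass
class Context:
    """Toy dynamic context: summaries of finished sub-goals plus the raw active trace."""
    summaries: List[str] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)

    def close_subgoal(self, goal: str) -> None:
        # Fold the raw steps of a finished sub-goal into one summary line.
        # A real system would presumably ask a model to write this summary.
        last = self.trace[-1] if self.trace else "no steps"
        self.summaries.append(f"{goal}: {len(self.trace)} step(s), last result = {last}")
        self.trace.clear()


class Supervisor:
    """Toy supervisor: treats identical consecutive steps as an anomaly and prunes them."""

    def __init__(self, max_repeats: int = 2) -> None:
        self.max_repeats = max_repeats

    def check(self, ctx: Context) -> bool:
        tail = ctx.trace[-self.max_repeats:]
        if len(tail) == self.max_repeats and len(set(tail)) == 1:
            # Keep a single copy of the repeated step and drop the rest.
            ctx.trace[:] = ctx.trace[: len(ctx.trace) - self.max_repeats + 1]
            return True
        return False


class Orchestrator:
    """Toy centralized orchestrator routing steps to a pool of named capabilities."""

    def __init__(self, pool: Dict[str, Callable[[str], str]]) -> None:
        self.pool = pool
        self.ctx = Context()
        self.supervisor = Supervisor()

    def run(self, plan: List[Tuple[str, str, str]]) -> Context:
        # Each plan entry is (sub_goal, capability_name, argument).
        current: Optional[str] = None
        for goal, capability, arg in plan:
            if current is not None and goal != current:
                self.ctx.close_subgoal(current)  # sub-goal finished: summarize it
            current = goal
            result = self.pool[capability](arg)  # route to the capability pool
            self.ctx.trace.append(f"{capability}({arg}) -> {result}")
            self.supervisor.check(self.ctx)      # detect and prune anomalies
        if current is not None:
            self.ctx.close_subgoal(current)
        return self.ctx


# Usage with two stand-in capabilities; real ones would be tools or sub-agents.
pool = {
    "search": lambda q: f"3 hits for '{q}'",
    "calc": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}
plan = [
    ("find sources", "search", "GAIA benchmark"),
    ("find sources", "search", "GAIA benchmark"),  # redundant repeat, pruned
    ("add counts", "calc", "2+3"),
]
ctx = Orchestrator(pool).run(plan)
for line in ctx.summaries:
    print(line)
```

In this sketch the working context never holds more than the raw steps of the active sub-goal. Everything already completed is represented by one summary line, which is the intuition behind using summaries to limit contextual noise in long-horizon tasks.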