We present Tongyi DeepResearch, an agentic large language model designed specifically for long-horizon, deep information-seeking research tasks. To incentivize autonomous deep research agency, Tongyi DeepResearch is developed through an end-to-end training framework that combines agentic mid-training and agentic post-training, enabling scalable reasoning and information seeking across complex tasks. We design a highly scalable, fully automatic data synthesis pipeline that requires no costly human annotation and supplies data for every training stage. By constructing customized environments for each stage, our system enables stable and consistent interactions throughout. Tongyi DeepResearch has 30.5 billion total parameters, of which only 3.3 billion are activated per token, and achieves state-of-the-art performance across a range of agentic deep research benchmarks, including Humanity's Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, xbench-DeepSearch, FRAMES, and xbench-DeepSearch-2510. We open-source the model, framework, and complete solutions to empower the community.