Open-Ended Deep Research (OEDR) pushes LLM agents beyond short-form QA toward long-horizon workflows that iteratively search, connect, and synthesize evidence into structured reports. However, existing OEDR agents largely follow either linear ``search-then-generate'' accumulation or outline-centric planning. The former suffers from lost-in-the-middle failures as evidence accumulates, while the latter relies on the LLM to implicitly infer knowledge gaps from the outline alone, providing weak supervision for identifying missing relations and triggering targeted exploration. We present DualGraph memory, an architecture that separates what the agent knows from how it writes. DualGraph maintains two co-evolving graphs: an Outline Graph (OG), which tracks the evolving report structure, and a Knowledge Graph (KG), a semantic memory that stores fine-grained knowledge units, including core entities, concepts, and their relations. By analyzing the KG topology together with structural signals from the OG, DualGraph generates targeted search queries, enabling more efficient and comprehensive knowledge-driven iterative exploration and refinement. Across DeepResearch Bench, DeepResearchGym, and DeepConsult, DualGraph consistently outperforms state-of-the-art baselines in report depth, breadth, and factual grounding; for example, it reaches a RACE score of 53.08 on DeepResearch Bench with GPT-5. Ablation studies further confirm the central role of the dual-graph design.