Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists

Existing research infrastructure is fundamentally document-centric: it provides citation links between papers but no explicit representation of methodological evolution. In particular, it does not capture the structured relationships that explain how and why research methods emerge, adapt, and build upon one another. As AI-driven research agents become a new class of consumers of scientific knowledge, this limitation grows more consequential, because such agents cannot reliably reconstruct method evolution topologies from unstructured text. We introduce Intern-Atlas, a methodological evolution graph that automatically identifies method-level entities, infers lineage relationships among methodologies, and captures the bottlenecks that drive transitions between successive innovations. Built from 1,030,314 papers spanning AI conferences, journals, and arXiv preprints, the graph comprises 9,410,201 semantically typed edges, each grounded in verbatim source evidence, and forms a queryable causal network of methodological development. To operationalize this structure, we further propose a self-guided temporal tree search algorithm that constructs evolution chains tracing the progression of methods over time. We evaluate the quality of the graph against expert-curated ground-truth evolution chains and observe strong alignment. In addition, we demonstrate that Intern-Atlas enables downstream applications in idea evaluation and automated idea generation. We position methodological evolution graphs as a foundational data layer for the emerging field of automated scientific discovery.
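To make the structure described above concrete, the following is a minimal sketch of an evidence-grounded, typed-edge method graph with a forward-in-time chain enumeration. It is an illustration under stated assumptions, not the Intern-Atlas implementation: the class and field names (`Edge`, `MethodGraph`, `relation`, `bottleneck`, `evidence`), the toy methods `A` to `D`, and their years are all hypothetical, and a plain depth-first tree search stands in for the self-guided temporal tree search, whose details the abstract does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str      # earlier method
    target: str      # later method that builds on the source
    relation: str    # hypothetical semantic edge type
    bottleneck: str  # limitation of the source that motivated the target
    evidence: str    # verbatim supporting sentence (placeholder text here)


class MethodGraph:
    def __init__(self):
        self.year = {}                      # method name -> first publication year
        self.out_edges = defaultdict(list)  # method name -> outgoing edges

    def add_method(self, name, year):
        self.year[name] = year

    def add_edge(self, edge):
        # Temporal constraint: a method cannot evolve from a later one.
        if self.year[edge.target] < self.year[edge.source]:
            raise ValueError("edge must point forward in time")
        self.out_edges[edge.source].append(edge)

    def chains(self, start, max_depth=4):
        """Enumerate evolution chains from `start` by depth-first tree search."""
        results = []

        def expand(path, visited):
            last = path[-1].target if path else start
            candidates = [e for e in self.out_edges[last] if e.target not in visited]
            # Stop at a leaf or at the depth limit and record the chain so far.
            if not candidates or len(path) == max_depth:
                if path:
                    results.append(list(path))
                return
            for edge in candidates:
                path.append(edge)
                expand(path, visited | {edge.target})
                path.pop()

        expand([], {start})
        return results


def chain_names(start, chain):
    """Flatten a chain of edges into the ordered list of method names."""
    return [start] + [edge.target for edge in chain]


# Toy data: all names, years, and texts are placeholders, not drawn from the graph.
g = MethodGraph()
for name, year in [("A", 2015), ("B", 2017), ("C", 2019), ("D", 2018)]:
    g.add_method(name, year)
g.add_edge(Edge("A", "B", "extends", "placeholder bottleneck", "placeholder quote"))
g.add_edge(Edge("B", "C", "replaces", "placeholder bottleneck", "placeholder quote"))
g.add_edge(Edge("A", "D", "adapts", "placeholder bottleneck", "placeholder quote"))

names = [chain_names("A", chain) for chain in g.chains("A")]
```

In this sketch every edge carries its own evidence string and bottleneck, so a consumer such as a research agent can inspect why each step in a chain was taken instead of inferring it from citation links alone. A guided search would replace the exhaustive loop over `candidates` with a scoring or pruning step.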
