AI agents are emerging as a practical way to run multi-step scientific workflows that interleave reasoning with tool use and verification, pointing to a shift from isolated AI-assisted steps toward \emph{agentic science at scale}. This shift is increasingly feasible, because scientific tools and models can be invoked through stable interfaces and verified with recorded execution traces. It is also increasingly necessary, because AI accelerates scientific output and stresses the peer-review and publication pipeline, raising the bar for traceability and credible evaluation. However, scaling agentic science remains difficult: workflows are hard to observe and reproduce; many tools and laboratory systems are not agent-ready; execution is hard to trace and govern; and prototype AI Scientist systems are often bespoke, limiting reuse and systematic improvement from real workflow signals. We argue that scaling agentic science requires an infrastructure-and-ecosystem approach, which we instantiate in Bohrium+SciMaster. Bohrium acts as a managed, traceable hub for AI for Science (AI4S) assets, akin to a HuggingFace for AI4S, that turns diverse scientific data, software, compute, and laboratory systems into agent-ready capabilities. SciMaster orchestrates these capabilities into long-horizon scientific workflows, on which scientific agents can be composed and executed. Between infrastructure and orchestration, a \emph{scientific intelligence substrate} organizes reusable models, knowledge, and components into executable building blocks for workflow reasoning and action, enabling composition, auditability, and improvement through use. We demonstrate this stack with eleven representative master agents in real workflows, achieving orders-of-magnitude reductions in end-to-end scientific cycle time and generating execution-grounded signals from real workloads at multi-million scale.