Open-source scientific software is abundant, yet most tools remain difficult to compile, configure, and reuse, sustaining a small-workshop mode of scientific computing. This deployment bottleneck limits reproducibility, large-scale evaluation, and the practical integration of scientific tools into modern AI-for-Science (AI4S) and agentic workflows. We present Deploy-Master, a one-stop agentic workflow for large-scale tool discovery, build specification inference, execution-based validation, and publication. Guided by a taxonomy spanning 90+ scientific and engineering domains, our discovery stage starts from a recall-oriented pool of over 500,000 public repositories and progressively filters it to 52,550 executable tool candidates under license- and quality-aware criteria. Deploy-Master transforms heterogeneous open-source repositories into runnable, containerized capabilities grounded in execution rather than documentation claims. In a single day, we performed 52,550 build attempts and constructed reproducible runtime environments for 50,112 scientific tools. Each successful tool is validated by a minimal executable command and registered in SciencePedia for search and reuse, enabling direct human use and optional agent-based invocation. Beyond delivering runnable tools, we report a deployment trace at the scale of 50,000 tools, characterizing throughput, cost profiles, failure surfaces, and specification uncertainty that become visible only at scale. These results explain why scientific software remains difficult to operationalize and motivate shared, observable execution substrates as a foundation for scalable AI4S and agentic science.
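The counts quoted in the abstract imply a few derived rates that are easy to check by hand. The following is a minimal sketch of that arithmetic, not part of Deploy-Master itself: the function name `funnel_summary` and its dictionary keys are hypothetical, the pool size is treated as a floor because the abstract only says "over 500,000", and "a single day" is assumed to mean 24 wall-clock hours.

```python
# Figures quoted in the abstract.
POOL_LOWER_BOUND = 500_000   # "over 500,000" repositories, so this is a floor
CANDIDATES = 52_550          # executable tool candidates, one build attempt each
SUCCESSES = 50_112           # tools with a reproducible runtime environment

# Assumption (not stated in the abstract): "a single day" is 24 wall-clock hours.
SECONDS_PER_DAY = 24 * 60 * 60


def funnel_summary(pool_floor, candidates, successes, seconds=SECONDS_PER_DAY):
    """Return derived rates for a discovery -> build funnel (hypothetical helper)."""
    if not 0 <= successes <= candidates <= pool_floor:
        raise ValueError("expected successes <= candidates <= pool_floor")
    return {
        "failed_builds": candidates - successes,
        "build_success_rate": successes / candidates,
        # Upper bound only: the real pool is larger than pool_floor.
        "max_candidate_fraction": candidates / pool_floor,
        # Average over the assumed 24 hours; says nothing about peak throughput.
        "avg_builds_per_second": candidates / seconds,
    }


summary = funnel_summary(POOL_LOWER_BOUND, CANDIDATES, SUCCESSES)
print(f"failed builds:            {summary['failed_builds']:,}")
print(f"build success rate:       {summary['build_success_rate']:.2%}")
print(f"candidates / pool (max):  {summary['max_candidate_fraction']:.2%}")
print(f"average builds per second: {summary['avg_builds_per_second']:.3f}")
```

Under these assumptions, roughly 95% of build attempts succeed, 2,438 fail, and at most about one in ten repositories from the initial pool survives filtering. The per-second figure is an average and depends entirely on the 24-hour assumption.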