Large language model (LLM)-based agent systems are increasingly used for scientific discovery, yet their practical capability remains constrained by a narrow, manually curated tool layer. Much scientific computational capability already exists in open-source repositories, software packages and APIs, but these resources remain difficult to standardize, operationalize and invoke reliably. Here we present ToolRosetta, a framework that gives LLM-based agent systems scalable, open-world computational access by automatically transforming heterogeneous computational programs into validated, callable tools. ToolRosetta integrates repository retrieval, tool standardization, execution testing, iterative repair and security-aware governance. Across 122 GitHub repositories spanning 35 subdisciplines in 6 domains, ToolRosetta standardizes 1,580 callable tools. These tools support an average verified task success rate of 84.0% across domains and substantially enhance existing agentic AI systems, including OpenClaw, particularly on out-of-distribution tasks that fall outside fixed, curated tool inventories.