Large language models (LLMs) are increasingly used to automate scientific workflows, yet their integration with heterogeneous computational tools remains ad hoc and fragile. Current agentic approaches often rely on unstructured text to manage context and coordinate execution, generating overwhelming volumes of information that can obscure decision provenance and hinder auditability. In this work, we present El Agente Gráfico, a single-agent framework that embeds LLM-driven decision-making within a type-safe execution environment and uses dynamic knowledge graphs for external persistence. Central to our approach are a structured abstraction of scientific concepts and an object-graph mapper that represents computational state as typed Python objects, held either in memory or persisted in an external knowledge graph. This design enables context management through typed symbolic identifiers rather than raw text, thereby ensuring consistency, supporting provenance tracking, and enabling efficient tool orchestration. We evaluate the system with an automated benchmarking framework on a suite of university-level quantum chemistry tasks previously evaluated on a multi-agent system, demonstrating that a single agent, when coupled to a reliable execution engine, can robustly perform complex, multi-step, and parallel computations. We further extend this paradigm to two other large classes of applications, conformer ensemble generation and metal-organic framework design, where knowledge graphs serve as both memory and reasoning substrates. Together, these results illustrate how abstraction and type safety can provide a scalable foundation for agentic scientific automation beyond prompt-centric designs.
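To make the idea of context management through typed symbolic identifiers concrete, the following is a minimal sketch and not the El Agente Gráfico implementation. All names here (`Ref`, `Molecule`, `EnergyResult`, `InMemoryStore`, `run_single_point`) are hypothetical, the store is a plain dictionary standing in for either in-memory state or an external knowledge graph, and the energy value is a placeholder rather than a computed result.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass
from typing import Dict, Generic, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Ref(Generic[T]):
    """Typed symbolic identifier (hypothetical): passed around instead of raw text."""
    kind: Type[T]
    key: str


@dataclass
class Molecule:
    smiles: str


@dataclass
class EnergyResult:
    molecule: Ref[Molecule]  # provenance link back to the input object
    method: str
    energy_hartree: float


class InMemoryStore:
    """Hypothetical stand-in for the persistence layer (memory or a knowledge graph)."""

    def __init__(self) -> None:
        self._objects: Dict[str, object] = {}

    def put(self, obj: T) -> Ref[T]:
        # Store the object and hand back a short, typed handle to it.
        key = f"{type(obj).__name__}:{uuid.uuid4().hex[:8]}"
        self._objects[key] = obj
        return Ref(type(obj), key)

    def get(self, ref: Ref[T]) -> T:
        obj = self._objects[ref.key]
        if not isinstance(obj, ref.kind):
            raise TypeError(f"{ref.key} is not a {ref.kind.__name__}")
        return obj


def run_single_point(
    store: InMemoryStore, mol_ref: Ref[Molecule], method: str
) -> Ref[EnergyResult]:
    """Toy tool: rejects wrongly typed inputs, then records a placeholder result."""
    if mol_ref.kind is not Molecule:
        raise TypeError("run_single_point expects a Ref[Molecule]")
    store.get(mol_ref)  # resolve the handle; a real tool would use the geometry
    # Placeholder energy, assumed for illustration only (no calculation is run).
    result = EnergyResult(molecule=mol_ref, method=method, energy_hartree=0.0)
    return store.put(result)


store = InMemoryStore()
water = store.put(Molecule(smiles="O"))
energy = run_single_point(store, water, method="HF/STO-3G")
print(energy.key, "derived from", store.get(energy).molecule.key)
```

In this sketch the agent's context only needs to hold short handles such as `water` and `energy`; the full objects stay in the store, and the `molecule` field on each result records which input it was derived from.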