We present El Agente Estructural, a multimodal, natural-language-driven agent for generating and manipulating molecular geometries in autonomous chemistry and molecular modelling. Unlike approaches that generate or edit molecules with generative models, Estructural mimics how human experts directly manipulate molecular systems in three dimensions by integrating a comprehensive set of domain-informed tools with vision-language models. This design enables precise control over atom and functional-group replacements, atomic connectivity, and stereochemistry without rebuilding the core molecular framework. Through a series of representative case studies, we demonstrate that Estructural enables chemically meaningful geometry manipulation across a wide range of real-world scenarios: site-selective functionalization, ligand binding, ligand exchange, stereochemically controlled structure construction, isomer interconversion, fragment-level structural analysis, image-guided generation of structures from schematic reaction mechanisms, and mechanism-driven geometry generation and modification. These examples illustrate how multimodal reasoning, combined with specialized geometry-aware tools, supports interactive and context-aware molecular modelling that goes beyond structure generation. Looking forward, integrating Estructural into El Agente Quntur, an autonomous multi-agent quantum chemistry platform, enhances that platform's capabilities by adding sophisticated tools for generating and editing three-dimensional structures.