We introduce ontology-to-tools compilation as a proof-of-principle mechanism for coupling large language models (LLMs) with formal domain knowledge. Within The World Avatar (TWA), ontological specifications are compiled into executable tool interfaces that LLM-based agents must use to create and modify knowledge graph instances, enforcing semantic constraints during generation rather than through post-hoc validation. In extending TWA's semantic agent composition framework, we make the Model Context Protocol (MCP) and associated agents integral components of the knowledge graph ecosystem, enabling structured interaction between generative models, symbolic constraints, and external resources. An agent-based workflow translates ontologies into ontology-aware tools and iteratively applies them to extract, validate, and repair structured knowledge from unstructured scientific text. Using metal-organic polyhedra synthesis literature as an illustrative case, we show how executable ontological semantics can guide LLM behaviour and reduce manual schema and prompt engineering, establishing a general paradigm for embedding formal knowledge into generative systems.
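
The core idea, that tools derived from an ontology reject invalid graph edits at call time rather than leaving them for later validation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `PropertySpec` and `Graph` types, the `compile_tools` function, and the `Synthesis`, `MOP`, and `hasProduct` names are hypothetical and are not TWA's actual ontology, tool interfaces, or MCP bindings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class PropertySpec:
    """Hypothetical, simplified stand-in for an ontology property axiom."""
    name: str
    range_class: str      # class the object instance must belong to
    max_cardinality: int  # upper bound on values per subject


@dataclass
class Graph:
    """Toy in-memory knowledge graph (illustrative only)."""
    types: Dict[str, str] = field(default_factory=dict)  # instance -> class
    triples: Set[Triple] = field(default_factory=set)


def compile_tools(
    ontology: Dict[str, List[PropertySpec]], graph: Graph
) -> Dict[str, Callable[..., str]]:
    """Turn each class and property into a callable that validates its input."""
    tools: Dict[str, Callable[..., str]] = {}
    for cls, props in ontology.items():
        # Default arguments bind the loop variables for each closure.
        def create(iri: str, _cls: str = cls) -> str:
            if iri in graph.types:
                raise ValueError(f"{iri} already exists")
            graph.types[iri] = _cls
            return iri

        tools[f"create_{cls}"] = create

        for prop in props:
            def link(subject: str, obj: str,
                     _cls: str = cls, _p: PropertySpec = prop) -> str:
                # Domain check: the subject must be an instance of the class.
                if graph.types.get(subject) != _cls:
                    raise ValueError(f"{subject} is not a {_cls}")
                # Range check: the object must be an instance of the range.
                if graph.types.get(obj) != _p.range_class:
                    raise ValueError(f"{obj} is not a {_p.range_class}")
                # Cardinality check before the triple is ever written.
                used = sum(1 for s, p, _ in graph.triples
                           if s == subject and p == _p.name)
                if used >= _p.max_cardinality:
                    raise ValueError(f"{subject} exceeds {_p.name} cardinality")
                graph.triples.add((subject, _p.name, obj))
                return "ok"

            tools[f"add_{cls}_{prop.name}"] = link
    return tools


# Hypothetical two-class ontology: a synthesis yields at most one MOP.
ontology = {
    "Synthesis": [PropertySpec("hasProduct", "MOP", 1)],
    "MOP": [],
}
graph = Graph()
tools = compile_tools(ontology, graph)
tools["create_Synthesis"]("syn1")
tools["create_MOP"]("mop1")
tools["add_Synthesis_hasProduct"]("syn1", "mop1")
```

In this sketch an agent can only change the graph through the generated callables, so a call that violates a domain, range, or cardinality constraint raises an error the agent can react to, and the invalid triple is never stored.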