Large language models have achieved impressive performance on reasoning tasks spanning mathematics, science, programming, and commonsense inference. Despite these advances, their reasoning processes remain largely latent, making them difficult to interpret, verify, replay, debug, and transfer across domains. Existing approaches such as chain-of-thought, tree-of-thoughts, graph-of-thoughts, and tool-augmented reasoning expose intermediate reasoning artifacts but typically lack explicit execution semantics, formal state representations, and verifiable reasoning structures. We introduce Theorem-Grounded Execution Ontologies (TGEO), a framework that models reasoning as an executable state-transition process rather than a sequence of generated tokens. Given an input problem, TGEO identifies relevant theorem families, binds the problem to a domain ontology, discovers semantic objects, instantiates states and operators, constructs predicates and contracts, and synthesizes an executable reasoning graph. The resulting graph provides an interpretable, replayable, and auditable representation of reasoning in which every state transition, operator application, and validation step is explicitly represented. TGEO integrates five architectural components: (1) theorem-grounded reasoning priors, (2) executable ontologies, (3) operator-mediated state transitions, (4) predicate and contract-based execution validation, and (5) architectural auditing and failure localization. We evaluate TGEO on theorem-intensive reasoning tasks derived from mathematical benchmark domains and a curated Golden Execution Suite. Our findings demonstrate the value of executable reasoning representations for interpretable, verifiable, and reproducible AI reasoning systems.
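To make the idea of operator-mediated state transitions with contract-based validation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the TGEO implementation: the names `Operator`, `TraceStep`, and `execute`, the dictionary-based state, and the Pythagorean-theorem example are all hypothetical and do not come from the framework itself. The sketch shows only how each transition can be guarded by a precondition, checked by a postcondition, and recorded in a trace so that a failure can be localized to a specific step.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a reasoning state is a flat mapping from symbol names to values.
State = Dict[str, float]


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator: a state transition guarded by a contract."""
    name: str
    theorem: str                          # theorem family the operator is grounded in
    pre: Callable[[State], bool]          # precondition on the input state
    apply: Callable[[State], State]       # the transition itself
    post: Callable[[State, State], bool]  # postcondition on (old, new) states


@dataclass
class TraceStep:
    """One audited step: which operator ran, on what state, and how it ended."""
    operator: str
    before: State
    after: Optional[State]
    status: str  # "ok", "pre_failed", or "post_failed"


def execute(initial: State, plan: List[Operator]) -> Tuple[State, List[TraceStep]]:
    """Run operators in order, stopping at the first contract violation."""
    state = dict(initial)
    trace: List[TraceStep] = []
    for op in plan:
        if not op.pre(state):
            trace.append(TraceStep(op.name, dict(state), None, "pre_failed"))
            break
        new_state = op.apply(dict(state))
        if not op.post(state, new_state):
            trace.append(TraceStep(op.name, dict(state), new_state, "post_failed"))
            break
        trace.append(TraceStep(op.name, dict(state), new_state, "ok"))
        state = new_state
    return state, trace


# Two illustrative operators for computing a hypotenuse.
sum_of_squares = Operator(
    name="sum_of_squares",
    theorem="Pythagorean theorem",
    pre=lambda s: s.get("a", 0) > 0 and s.get("b", 0) > 0,
    apply=lambda s: {**s, "c_squared": s["a"] ** 2 + s["b"] ** 2},
    post=lambda old, new: new["c_squared"] > 0,
)

take_root = Operator(
    name="take_root",
    theorem="Pythagorean theorem",
    pre=lambda s: s.get("c_squared", -1) >= 0,
    apply=lambda s: {**s, "c": math.sqrt(s["c_squared"])},
    post=lambda old, new: math.isclose(new["c"] ** 2, old["c_squared"]),
)

plan = [sum_of_squares, take_root]

# A well-formed problem: both contracts hold at every step.
final_state, trace = execute({"a": 3.0, "b": 4.0}, plan)

# An ill-formed problem: side "b" is missing, so the first precondition fails.
bad_state, bad_trace = execute({"a": 3.0}, plan)
```

In this sketch the trace plays the role of the replayable, auditable record: `trace` contains one `"ok"` entry per operator, while `bad_trace` ends at the first operator with status `"pre_failed"`, which pinpoints where execution stopped. A linear plan is used for brevity; a graph-structured plan would need a scheduling step that this example omits.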