Do large language models internally encode ontological relations in a formally verifiable algebraic structure? We introduce Algebraic Ontology Projection (AOP), which projects LLM hidden states into the Galois field F2 under Liskov Substitution Principle constraints, using only 42 relational pairs as algebraic keys. AOP achieves up to 93.33% zero-shot inclusion accuracy on unseen concept pairs (Gemma-2 Instruct with an optimized prompt) and a consistent 86.67% across multiple model families, with no model tuning and through prompting alone. This algebraic structure is strongly layer-dependent. We introduce Semantic Crystallisation (SC), a metric that quantifies F2 constraint satisfaction relative to a random baseline and predicts zero-shot accuracy without held-out data. System prompts act as algebraic boundary conditions: only their combination with instruction tuning prevents Late-layer Collapse, a systematic degradation of logical consistency in the final layers observed in 7 of 10 conditions. These findings reframe forward computation as an iterative process of algebraic organisation and open a path toward LLMs whose logical structure is not merely approximated but formally accessible.
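To make the idea of scoring F2 constraint satisfaction against a random baseline concrete, the following is a minimal sketch and not the paper's definition of AOP or SC. It assumes each concept has already been projected to a bit vector over F2 (stored as a Python integer), and it assumes an inclusion constraint of the form "a subtype carries every bit of its supertype", which is one possible reading of the Liskov-style constraint. The function names, the toy codes, and the "observed minus baseline" score are all hypothetical.

```python
import random

def satisfies_inclusion(sub_bits: int, sup_bits: int) -> bool:
    """Assumed constraint: every bit set in the supertype is set in the subtype.

    Componentwise over F2 this is sup * (1 + sub) == 0.
    """
    return sup_bits & ~sub_bits == 0

def satisfaction_rate(codes: dict, pairs: list) -> float:
    """Fraction of (subtype, supertype) pairs whose codes satisfy the constraint."""
    ok = sum(satisfies_inclusion(codes[sub], codes[sup]) for sub, sup in pairs)
    return ok / len(pairs)

def random_baseline(names: list, pairs: list, n_bits: int,
                    trials: int = 200, seed: int = 0) -> float:
    """Mean satisfaction rate when every concept gets a uniformly random code."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        codes = {name: rng.getrandbits(n_bits) for name in names}
        total += satisfaction_rate(codes, pairs)
    return total / trials

def relative_score(codes: dict, pairs: list, n_bits: int) -> float:
    """Hypothetical SC-like score: observed satisfaction minus random baseline."""
    observed = satisfaction_rate(codes, pairs)
    baseline = random_baseline(list(codes), pairs, n_bits)
    return observed - baseline

# Toy 4-bit codes, invented purely for illustration.
codes = {"animal": 0b0001, "dog": 0b0011, "poodle": 0b0111, "car": 0b1000}
pairs = [("dog", "animal"), ("poodle", "dog"), ("poodle", "animal")]

observed = satisfaction_rate(codes, pairs)
score = relative_score(codes, pairs, n_bits=4)
```

In this toy setup all three is-a pairs satisfy the assumed constraint, while an unrelated pair such as ("car", "animal") does not. Under these assumptions, a score well above zero indicates that the codes respect the constraints more often than random bit vectors would.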