Autoregressive LLMs perform well on relational tasks that require linking entities via relational words (e.g., father/son, friend), but it is unclear whether they learn the logical semantics of such relations (e.g., symmetry and inversion), and whether reversal-type failures stem from missing relational semantics or from left-to-right order bias. To address these questions, we propose a controlled knowledge-graph-based synthetic framework that generates text from symmetric/inverse triples, train GPT-style autoregressive models on this data from scratch, and evaluate memorization, logical inference, and in-context generalization to unseen entities. We find a sharp phase transition in which relational semantics emerge once logic-bearing supervision is sufficient, even in shallow (2-3 layer) models, and we show that successful generalization aligns with stable intermediate-layer signals. Finally, order-matched forward/reverse tests and a diffusion baseline indicate that reversal failures are primarily driven by autoregressive order bias rather than by deficient inversion semantics.
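To make the setup concrete, the sketch below shows one plausible way symmetric/inverse triples could be expanded into plain-text training sentences; the relation names, entity names, and templates here are illustrative assumptions, not the paper's actual generation scheme.

```python
# Hypothetical illustration of generating text from symmetric/inverse
# knowledge-graph triples. Relations, entities, and templates are assumed
# for illustration only.

SYMMETRIC = {"friend_of"}                      # r(a, b) also implies r(b, a)
INVERSE = {"parent_of": "child_of"}            # r(a, b) implies r_inv(b, a)

def triple_to_text(head, relation, tail):
    """Render one triple as a plain-text training sentence."""
    return f"{head} is {relation.replace('_', ' ')} {tail}."

def expand_triple(head, relation, tail):
    """Emit the stated fact plus the fact its logical semantics entails."""
    facts = [(head, relation, tail)]
    if relation in SYMMETRIC:
        facts.append((tail, relation, head))           # symmetry
    elif relation in INVERSE:
        facts.append((tail, INVERSE[relation], head))  # inversion
    return facts

if __name__ == "__main__":
    triples = [("Alice", "friend_of", "Bob"), ("Carol", "parent_of", "Dave")]
    for t in triples:
        for fact in expand_triple(*t):
            print(triple_to_text(*fact))
```

Whether the entailed fact is included in or withheld from training is exactly the kind of knob such a framework can control when probing memorization versus logical inference.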