Autoregressive LLMs perform well on relational tasks that require linking entities via relational words (e.g., father/son, friend), but it is unclear whether they learn the logical semantics of such relations (e.g., symmetry and inversion) and, if so, whether reversal-type failures arise from missing relational semantics or from left-to-right order bias. To address these questions, we propose a controlled, knowledge-graph-based synthetic framework that generates text from symmetric and inverse triples, train GPT-style autoregressive models from scratch, and evaluate memorization, logical inference, and in-context generalization to unseen entities. We find a sharp phase transition in which relational semantics emerge once logic-bearing supervision is sufficient, even in shallow (2-3 layer) models, and that successful generalization aligns with stable intermediate-layer signals. Finally, order-matched forward/reverse tests and a diffusion baseline indicate that reversal failures are driven primarily by autoregressive order bias rather than by deficient inversion semantics.
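To make the data-generation idea concrete, the following is a minimal sketch of how text could be produced from symmetric and inverse triples. It is not the paper's actual pipeline: the relation schema, the sentence template, the function names (`closure`, `verbalize`, `build_split`), and the `paired_fraction` parameter are all illustrative assumptions. In particular, reading "logic-bearing supervision" as the share of facts whose logically implied counterpart also appears in training is one plausible interpretation, not something the abstract specifies.

```python
import random

# Hypothetical relation schema (an assumption, not taken from the paper):
# symmetric relations imply themselves with arguments swapped,
# inverse relations imply their counterpart with arguments swapped.
SYMMETRIC = {"friend"}
INVERSE = {"father": "son", "son": "father"}


def closure(triple):
    """Return the triple logically implied by (head, relation, tail), or None."""
    head, rel, tail = triple
    if rel in SYMMETRIC:
        return (tail, rel, head)
    if rel in INVERSE:
        return (tail, INVERSE[rel], head)
    return None


def verbalize(triple):
    """Render a triple with a single fixed template (illustrative only)."""
    head, rel, tail = triple
    return f"{head} is the {rel} of {tail}."


def build_split(base_triples, paired_fraction, seed=0):
    """Split facts into training text and held-out inference queries.

    Every base triple is verbalized into the training set. For a
    `paired_fraction` share of them, the implied triple is also added to
    training; for the rest, the implied triple is held out so that
    answering it requires inference rather than memorization.
    """
    rng = random.Random(seed)
    triples = list(base_triples)
    rng.shuffle(triples)
    n_paired = int(len(triples) * paired_fraction)

    train, held_out = [], []
    for i, triple in enumerate(triples):
        train.append(verbalize(triple))
        implied = closure(triple)
        if implied is None:
            continue
        if i < n_paired:
            train.append(verbalize(implied))
        else:
            held_out.append(verbalize(implied))
    return train, held_out
```

Under these assumptions, sweeping `paired_fraction` from 0 to 1 and measuring accuracy on the held-out sentences would be one way to probe for a transition of the kind the abstract describes.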