Whether Large Language Models (LLMs) develop coherent internal world models remains a central debate. While conventional Next-Token Prediction (NTP) supervises only one step ahead, Multi-Token Prediction (MTP) has shown promise in learning more structured representations. In this work, we analyze the gradient inductive bias of MTP from a theoretical perspective, supported by empirical evidence, and show that MTP promotes convergence toward internal belief states by inducing representational contractivity via gradient coupling. However, we also show that standard MTP often suffers from structural hallucinations, in which discrete token supervision encourages illegal shortcuts in latent space that violate environmental constraints. To address this, we propose Latent Semantic Enhancement MTP (LSE-MTP), a method that anchors predictions to ground-truth hidden state trajectories. Experiments on synthetic graphs and the real-world Manhattan Taxi Ride dataset show that LSE-MTP bridges the gap between discrete tokens and continuous state representations, improving representation alignment, reducing structural hallucinations, and increasing robustness to perturbations.
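To make the contrast between the objectives concrete, the following is a minimal sketch, not the paper's implementation. It assumes that MTP averages cross-entropy over K prediction heads (head k at position t targets the token at t + k + 1, so K = 1 reduces to NTP), and that the latent anchoring in LSE-MTP can be approximated by a mean-squared-error term between predicted latents and reference hidden states. The function names, the MSE form, and the weight `lam` are all illustrative assumptions; the abstract does not specify them.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def mtp_loss(head_logits, tokens):
    """Average cross-entropy over K prediction heads (assumed MTP form).

    head_logits[k][t] holds the logits that head k emits at position t
    for the token at position t + k + 1. With K = 1 this is plain NTP.
    Assumes the sequence is longer than the number of heads.
    """
    total, count = 0.0, 0
    for k, logits_k in enumerate(head_logits):
        offset = k + 1
        for t in range(len(tokens) - offset):
            total += cross_entropy(logits_k[t], tokens[t + offset])
            count += 1
    return total / count


def latent_anchor_loss(pred_latents, hidden_states):
    """Mean squared error between predicted and reference latents.

    Hypothetical stand-in for anchoring predictions to hidden state
    trajectories: pred_latents[k][t] is head k's predicted latent for
    position t + k + 1, hidden_states[t] is the reference state at t.
    """
    total, count = 0.0, 0
    for k, preds_k in enumerate(pred_latents):
        offset = k + 1
        for t in range(len(hidden_states) - offset):
            target = hidden_states[t + offset]
            total += sum((p - h) ** 2 for p, h in zip(preds_k[t], target))
            count += len(target)
    return total / count


def combined_loss(head_logits, tokens, pred_latents, hidden_states, lam=0.5):
    """Token-level MTP loss plus a weighted latent anchoring term.

    `lam` is an illustrative weight, not a value from the paper.
    """
    token_term = mtp_loss(head_logits, tokens)
    latent_term = latent_anchor_loss(pred_latents, hidden_states)
    return token_term + lam * latent_term
```

With uniform logits over a vocabulary of size V, `mtp_loss` evaluates to log V regardless of the number of heads, and `latent_anchor_loss` is zero when every predicted latent equals its reference hidden state. In a real training setup the reference hidden states would typically be treated as fixed targets; that detail is likewise an assumption here.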