Language models generate reasoning as a single sequential trace, which prevents them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure, enabling a model to generate and selectively expand multiple search paths within a single generation pass. By training on complete search trees that include both successful and failed attempts, TSLM learns to internalize systematic exploration without redundantly recomputing shared prefixes. Because it avoids the multiple independent forward passes that external search methods require, TSLM achieves robust performance with superior inference efficiency. These results suggest a new paradigm of inference-time scaling for robust reasoning: supervised learning on complete tree-structured traces offers an efficient alternative for developing systematic exploration capabilities in language models.
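To make the encoding idea concrete, here is a minimal sketch of how a search tree could be flattened into one token sequence with branch-delimiting special tokens, so that sibling paths share their parent's prefix instead of regenerating it. The token names (`<branch>`, `<end>`) and the tree layout are illustrative assumptions, not the paper's actual scheme.

```python
# Hypothetical linearization of a search tree into a single sequence.
# Assumed special tokens (not the paper's actual vocabulary):
BRANCH, END = "<branch>", "<end>"

def linearize(node):
    """Depth-first flatten: emit the node's content once, then wrap each
    child subtree in BRANCH/END markers. A shared prefix therefore appears
    exactly once, no matter how many paths descend from it."""
    tokens = [node["text"]]
    for child in node.get("children", []):
        tokens.append(BRANCH)
        tokens.extend(linearize(child))
        tokens.append(END)
    return tokens

# Toy tree with one failed attempt and one successful continuation,
# mirroring the idea of training on complete trees including failures.
tree = {
    "text": "step0",
    "children": [
        {"text": "fail_path"},                          # abandoned branch
        {"text": "ok_path",
         "children": [{"text": "answer"}]},             # successful branch
    ],
}

seq = linearize(tree)
# "step0" occurs once even though two search paths share it, whereas
# independent sampling would regenerate it per path.
```

A model trained on such sequences sees both branches in one context, which is what allows it to expand paths selectively within a single generation pass.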