Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We introduce Reasoning Core, a scalable suite that procedurally generates verifiable symbolic reasoning data across core formal domains: PDDL planning over randomized domains, first-order logic with equality, context-free grammar parsing and generation, causal reasoning over random Bayesian networks, and systems of equations. Each task is paired with an external solver for rigorous verification and admits continuous difficulty control for curriculum design. Examples can optionally include solver-derived reasoning traces, enabling supervised training from the earliest pre-training stages, and the same interface provides verifiable reward functions for reinforcement learning. Our experiments show that mixing Reasoning Core data into pre-training improves downstream reasoning while preserving, or slightly improving, language modeling quality. Zero-shot evaluations confirm these tasks challenge frontier models such as GPT-5. The code and data are publicly available under the MIT license.