We introduce the Lattice Deduction Transformer (LDT), a recurrent transformer that approximates logically sound deduction by projecting its latent state through a lattice between forward passes. We train on-policy in a process that mirrors deduction in a search-based constraint solver and supervise training via a domain-agnostic, abstract-interpretation-based approximation of the set of solution candidates. An $800$K-parameter LDT achieves $100\%$ accuracy on Sudoku-Extreme and Snowflake Sudoku, at a fraction of the training cost of prior small recurrent reasoners, while remaining empirically sound: the model returns a correct answer or abstains. A $1.8$M-parameter variant reaches $99.9\%$ accuracy on Maze-Hard. Frontier LLMs score $0\%$ on all three benchmarks.
翻译:我们提出格点演绎变换器(Lattice Deduction Transformer, LDT),这是一种循环变换器架构,通过在前向传播之间将隐状态投影至格点结构来近似逻辑严谨的演绎推理。我们采用同策略(on-policy)训练方式,模拟基于搜索的约束求解器中的演绎过程,并通过一种领域无关的、基于抽象解释(abstract interpretation)的解候选集近似方法进行训练监督。一个拥有80万参数的LDT模型在Sudoku-Extreme和Snowflake Sudoku任务上达到100%的准确率,其训练成本仅为先前小型循环推理器的极小部分,同时保持经验上的可靠性:模型要么返回正确答案,要么弃权。一个180万参数的变体在Maze-Hard任务上达到99.9%的准确率。前沿大语言模型(Frontier LLMs)在这三个基准测试上的得分为0%。