Premise selection is a fundamental problem of automated theorem proving. Previous works often use intricate symbolic methods, rely on domain knowledge, and require significant engineering effort to solve this task. In this work, we show that Magnushammer, a neural transformer-based approach, can outperform traditional symbolic systems by a large margin. Tested on the PISA benchmark, Magnushammer achieves $59.5\%$ proof rate compared to a $38.3\%$ proof rate of Sledgehammer, the most mature and popular symbolic-based solver. Furthermore, by combining Magnushammer with a neural formal prover based on a language model, we significantly improve the previous state-of-the-art proof rate from $57.0\%$ to $71.0\%$.
翻译:前提选择是自动定理证明中的一个基本问题。先前的研究通常使用复杂的符号方法,依赖领域知识,并需要大量的工程努力来解决这一任务。在本工作中,我们展示了马格努斯锤(Magnushammer)——一种基于神经Transformer的方法——能够大幅度超越传统符号系统。在PISA基准测试上,马格努斯锤达到了$59.5\%$的证明率,而最成熟且广泛使用的基于符号的求解器Sledgehammer的证明率为$38.3\%$。此外,通过将马格努斯锤与基于语言模型的神经形式化证明器相结合,我们将先前的最先进证明率从$57.0\%$显著提升至$71.0\%$。