Artificial Intelligence for Theorem Proving has given rise to a plethora of benchmarks and methodologies, particularly in Interactive Theorem Proving (ITP). Research in the area is fragmented, with a diverse set of approaches being spread across several ITP systems. This presents a significant challenge to the comparison of methods, which are often complex and difficult to replicate. Addressing this, we present BAIT, a framework for fair and streamlined comparison of learning approaches in ITP. We demonstrate BAIT's capabilities with an in-depth comparison, across several ITP benchmarks, of state-of-the-art architectures applied to the problem of formula embedding. We find that Structure Aware Transformers perform particularly well, improving on techniques associated with the original problem sets. BAIT also allows us to assess the end-to-end proving performance of systems built on interactive environments. This unified perspective reveals a novel end-to-end system that improves on prior work. We also provide a qualitative analysis, illustrating that improved performance is associated with more semantically-aware embeddings. By streamlining the implementation and comparison of Machine Learning algorithms in the ITP context, we anticipate BAIT will be a springboard for future research.
翻译:人工智能定理证明催生了大量基准测试和方法,尤其是在交互式定理证明(ITP)领域。该领域的研究呈现碎片化特征,多种不同方法分散于多个ITP系统中,这给常因复杂而难以复现的方法对比带来了重大挑战。针对这一问题,我们提出BAIT——一个用于公平且流式对比ITP学习方法框架。我们通过跨多个ITP基准的深度对比展示了BAIT的能力,这些对比聚焦于应用于公式嵌入问题的当前最优架构。研究发现,结构感知Transformer表现尤为出色,其性能优于与原始问题集相关联的技术。BAIT还能使我们评估基于交互环境构建的系统端到端证明能力,这种统一视角揭示了一种超越先前工作的新型端到端系统。我们还进行了定性分析,表明性能提升与更具语义感知的嵌入相关联。通过简化ITP场景下机器学习算法的实现与对比流程,我们预期BAIT将成为未来研究的跳板。