Autoformalization involves automatically translating informal math into formal theorems and proofs that are machine-verifiable. Euclidean geometry provides an interesting and controllable domain for studying autoformalization. In this paper, we introduce a neuro-symbolic framework for autoformalizing Euclidean geometry, which combines domain knowledge, SMT solvers, and large language models (LLMs). One challenge in Euclidean geometry is that informal proofs rely on diagrams, leaving gaps in texts that are hard to formalize. To address this issue, we use theorem provers to fill in such diagrammatic information automatically, so that the LLM only needs to autoformalize the explicit textual steps, making it easier for the model. We also provide automatic semantic evaluation for autoformalized theorem statements. We construct LeanEuclid, an autoformalization benchmark consisting of problems from Euclid's Elements and the UniGeo dataset formalized in the Lean proof assistant. Experiments with GPT-4 and GPT-4V show the capability and limitations of state-of-the-art LLMs on autoformalizing geometry problems. The data and code are available at https://github.com/loganrjmurphy/LeanEuclid.
翻译:自动形式化是指将非形式化的数学内容自动转换为机器可验证的形式化定理与证明。欧几里得几何为研究自动形式化提供了一个有趣且可控的领域。本文提出了一种用于欧几里得几何自动形式化的神经符号框架,该框架结合了领域知识、可满足性模理论求解器以及大语言模型。欧几里得几何中的一个挑战在于,非形式化证明依赖于图表,导致文本中存在难以形式化的信息缺口。为解决此问题,我们使用定理证明器自动填补此类图表信息,从而使大语言模型仅需对显式的文本步骤进行自动形式化,降低了模型处理的难度。我们还为自动形式化的定理陈述提供了自动语义评估。我们构建了LeanEuclid——一个在Lean证明助手中形式化的自动形式化基准,其问题来源于《几何原本》与UniGeo数据集。使用GPT-4与GPT-4V进行的实验展示了当前最先进的大语言模型在处理几何问题自动形式化方面的能力与局限。数据与代码可在 https://github.com/loganrjmurphy/LeanEuclid 获取。