This is the first paper in a series of work we have accomplished over the past three years. In this paper, we have constructed a complete and compatible formal plane geometry system. This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning. Within this formal framework, we have been able to seamlessly integrate modern AI models with our formal system. AI is now capable of providing deductive reasoning solutions to IMO-level plane geometry problems, just like handling other natural languages, and these proofs are readable, traceable, and verifiable. We propose the geometry formalization theory (GFT) to guide the development of the geometry formal system. Based on the GFT, we have established the FormalGeo, which consists of 88 geometric predicates and 196 theorems. It can represent, validate, and solve IMO-level geometry problems. we also have crafted the FGPS (formal geometry problem solver) in Python. It serves as both an interactive assistant for verifying problem-solving processes and an automated problem solver. We've annotated the formalgeo7k and formalgeo-imo datasets. The former contains 6,981 (expand to 133,818 through data augmentation) geometry problems, while the latter includes 18 (expand to 2,627 and continuously increasing) IMO-level challenging geometry problems. All annotated problems include detailed formal language descriptions and solutions. Implementation of the formal system and experiments validate the correctness and utility of the GFT. The backward depth-first search method only yields a 2.42% problem-solving failure rate, and we can incorporate deep learning techniques to achieve lower one. The source code of FGPS and datasets are available at https://github.com/BitSecret/FGPS.
翻译:这是我们过去三年完成的一系列工作中的第一篇论文。本文构建了一个完整且兼容的形式化平面几何系统,这将作为连接IMO级平面几何挑战与可读AI自动推理的关键桥梁。在该形式化框架内,我们得以将现代AI模型与形式化系统无缝集成。如今,AI能够像处理其他自然语言一样,为IMO级平面几何问题提供演绎推理解决方案,且这些证明具有可读性、可追溯性和可验证性。我们提出几何形式化理论(GFT)以指导几何形式系统的开发。基于GFT,我们建立了FormalGeo,该系统包含88个几何谓词和196个定理,能够表示、验证和求解IMO级几何问题。同时,我们用Python构建了FGPS(形式化几何问题求解器),它既可作为验证解题过程的交互式助手,也可作为自动化问题求解器。我们标注了formalgeo7k和formalgeo-imo数据集:前者包含6,981个(通过数据增强扩展至133,818个)几何问题,后者包含18个(扩展至2,627个且持续增加)IMO级挑战性几何问题。所有标注问题均包含详细的形式化语言描述与解答。形式化系统的实现与实验验证了GFT的正确性与实用性。向后深度优先搜索方法仅产生2.42%的解题失败率,且可融入深度学习技术进一步降低该比率。FGPS的源代码与数据集已开源至https://github.com/BitSecret/FGPS。