FGeo-TP: A Language Model-Enhanced Solver for Geometry Problems

Applying contemporary artificial intelligence techniques to geometric problem solving and automated deductive proof has long been a grand challenge for the interdisciplinary field of mathematics and artificial intelligence. This is the fourth article in our series. In previous work, we established a formalized geometric system known as FormalGeo and annotated approximately 7000 geometric problems, forming the FormalGeo7k dataset. Although FGPS (Formal Geometry Problem Solver) can achieve interpretable algebraic equation solving and human-like deductive reasoning, it often times out due to the complexity of its search strategy. In this paper, we introduce FGeo-TP (Theorem Predictor), which uses a language model to predict theorem sequences for solving geometry problems. We compare the effectiveness of various Transformer architectures, such as BART and T5, in theorem prediction, and use the predictions to prune the search process of FGPS, thereby improving its performance in solving geometry problems. Our results demonstrate a significant increase in the problem-solving rate of the language model-enhanced FGeo-TP on the FormalGeo7k dataset, rising from 39.7% to 80.86%. Furthermore, FGeo-TP exhibits notable reductions in solving time and search steps across problems of varying difficulty levels.
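The idea of pruning a search with predicted theorems can be illustrated with a toy example. The sketch below is not the FGPS or FGeo-TP implementation: the rule format, the theorem names, and the `forward_search` function are all hypothetical, and the "predicted" list is hard-coded where a language model's output would go. It only shows why restricting the candidate theorems can reduce the number of search steps.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

# Hypothetical toy rule format: theorem name -> (premises, conclusion).
Theorem = Tuple[FrozenSet[str], str]


def forward_search(
    facts: Iterable[str],
    goal: str,
    theorems: Dict[str, Theorem],
    allowed: Optional[Iterable[str]] = None,
) -> Tuple[bool, int]:
    """Naive forward chaining.

    If `allowed` is given, only those theorem names are tried (the pruning).
    Returns (goal reached?, number of theorem applications attempted).
    """
    known: Set[str] = set(facts)
    allowed_set = None if allowed is None else set(allowed)
    names = [n for n in theorems if allowed_set is None or n in allowed_set]
    attempts = 0
    changed = True
    while changed and goal not in known:
        changed = False
        for name in names:
            premises, conclusion = theorems[name]
            attempts += 1
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
                if goal in known:
                    break
    return goal in known, attempts


# Toy theorem library (assumed for illustration only).
toy_theorems: Dict[str, Theorem] = {
    "noise1": (frozenset({"a"}), "x"),
    "noise2": (frozenset({"x"}), "y"),
    "noise3": (frozenset({"y"}), "z"),
    "t1": (frozenset({"a"}), "b"),
    "t2": (frozenset({"b"}), "c"),
}

# Stand-in for a language model's predicted theorem sequence.
predicted = ["t1", "t2"]

full_solved, full_attempts = forward_search({"a"}, "c", toy_theorems)
pruned_solved, pruned_attempts = forward_search(
    {"a"}, "c", toy_theorems, allowed=predicted
)
print(full_solved, full_attempts, pruned_solved, pruned_attempts)
```

In this toy setting both searches reach the goal, but the pruned one tries fewer theorems. A prediction that omits a needed theorem would make the pruned search fail, so a real system would need some way to handle incomplete predictions; how FGeo-TP does this is not described in the abstract.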

