Using Large Language Models (LLMs) to solve mathematical problems is an intriguing research direction, given the abundance of math problems expressed in natural language across science and engineering fields. While several prior works have investigated solving elementary mathematics with LLMs, this work explores the frontier of using GPT-4 to solve more complex and challenging math problems. We evaluate several ways of using GPT-4: some are adapted from existing work, and one is MathChat, a conversational problem-solving framework newly proposed in this work. We perform the evaluation on difficult high school competition problems from the MATH dataset, and the results show the advantage of the proposed conversational approach.