Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. LLMs, with their generalized ability, are used as a foundation model to build AI agents for different tasks. In this paper, we study the effectiveness of utilizing LLM agents to solve math problems through conversations. We propose MathChat, a conversational problem-solving framework designed for math problems. MathChat consists of an LLM agent and a user proxy agent which is responsible for tool execution and additional guidance. This synergy facilitates a collaborative problem-solving process, where the agents engage in a dialogue to solve the problems. We perform evaluation on difficult high school competition problems from the MATH dataset. Utilizing Python, we show that MathChat can further improve previous tool-using prompting methods by 6%.
翻译:利用大型语言模型(LLM)解决数学问题是一项引人关注的研究方向,鉴于众多科学与工程领域存在大量以自然语言表述的数学问题。LLM凭借其泛化能力,可作为构建面向不同任务的AI智能体的基础模型。本文研究了通过对话形式运用LLM智能体解决数学问题的有效性。我们提出MathChat——一个专为数学问题设计的对话式求解框架。MathChat包含一个LLM智能体和一个用户代理智能体,后者负责工具执行与附加指导。这种协同机制促成了协作式问题求解过程,智能体通过对话交互共同解决问题。我们在MATH数据集中的高难度高中竞赛题上进行评估。通过使用Python工具,我们证明MathChat能将现有工具调用提示方法的性能提升6%。