When large language models (LLMs) are used for mathematical reasoning, a crucial challenge is correcting the reasoning and calculation errors in the text they generate. In this paper, we propose a novel framework that integrates the Chain-of-Thought (CoT) method with an external tool, a Python REPL. We find that prompting LLMs to generate structured text in an XML-like markup language lets us seamlessly integrate CoT with the external tool and control undesired LLM behaviors. With our approach, LLMs can use Python computation to rectify errors within CoT. We applied our method to ChatGPT (GPT-3.5) on challenging mathematical problems and demonstrated that combining CoT and the Python REPL through the markup language enhances the reasoning capability of LLMs. Our approach enables LLMs to write the markup language and perform advanced mathematical reasoning using only zero-shot prompting.
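To make the integration concrete, the following is a minimal sketch of how XML-like markup could connect CoT text to a Python REPL. It is an illustration under stated assumptions, not the implementation used in this paper: the tag names `<think>`, `<python>`, and `<output>`, the helper `run_python_blocks`, and the sample generated text are all hypothetical, and the sketch post-processes a finished text, whereas a real system would more plausibly pause generation at each code block and feed the result back to the model.

```python
import contextlib
import io
import re

# Hypothetical tag name: the abstract only says the markup is XML-like.
PYTHON_BLOCK = re.compile(r"<python>(.*?)</python>", re.DOTALL)


def run_python_blocks(markup: str) -> str:
    """Execute each <python> block and append its stdout in an <output> tag."""
    namespace: dict = {}  # shared across blocks, like a single REPL session

    def execute(match: re.Match) -> str:
        code = match.group(1).strip()
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                # Model-written code is untrusted; sandbox it in real use.
                exec(code, namespace)
            result = buffer.getvalue().strip()
        except Exception as exc:
            # Report the error as text so the model could react to it.
            result = f"{type(exc).__name__}: {exc}"
        return f"{match.group(0)}\n<output>{result}</output>"

    return PYTHON_BLOCK.sub(execute, markup)


# Hypothetical model output mixing a reasoning step with a computation.
generated = (
    "<think>17 * 23 is about 400; check it exactly.</think>\n"
    "<python>print(17 * 23)</python>"
)
print(run_python_blocks(generated))
```

Because the markup delimits reasoning and code explicitly, a simple parser can tell which spans to execute and which to leave as text, which is the property that lets computed results be placed next to the reasoning they are meant to check.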