Mathematical reasoning remains a significant challenge for large language models (LLMs), despite progress in prompting techniques such as Chain-of-Thought (CoT). We present Chain of Mathematically Annotated Thought (CoMAT), which enhances reasoning through two stages: Symbolic Conversion (converting natural language queries into symbolic form) and Reasoning Execution (deriving answers from symbolic representations). CoMAT operates entirely with a single LLM and without external solvers. Across four LLMs, CoMAT outperforms traditional CoT on six out of seven benchmarks, achieving gains of 4.48% on MMLU-Redux (MATH) and 4.58% on GaoKao MCQ. In addition to improved performance, CoMAT ensures faithfulness and verifiability, offering a transparent reasoning process for complex mathematical tasks.
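The two-stage pipeline described above can be illustrated with a minimal sketch: one prompt performs Symbolic Conversion and a second prompt performs Reasoning Execution, both issued to the same LLM with no external solver. The `llm()` helper and the prompt wording here are hypothetical placeholders, not the paper's actual prompts.

```python
# Minimal sketch of the two-stage CoMAT pipeline, assuming a generic LLM client.
# The llm() helper and the exact prompt text are illustrative assumptions.

def llm(prompt: str) -> str:
    """Placeholder for a call to a single LLM (e.g. through an API client)."""
    raise NotImplementedError

def comat(question: str) -> str:
    # Stage 1: Symbolic Conversion - translate the natural-language question
    # into a symbolic representation (variables, equations, constraints).
    symbolic = llm(
        "Convert the following maths question into symbolic form, "
        "defining variables and writing only equations and constraints:\n"
        f"{question}"
    )

    # Stage 2: Reasoning Execution - derive the answer step by step from the
    # symbolic representation, using the same LLM and no external solver.
    answer = llm(
        "Solve the problem below step by step from its symbolic form "
        "and state the final answer:\n"
        f"{symbolic}"
    )
    return answer
```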