Large Language Models (LLMs) have shown strong performance in solving mathematical problems, with code-based solutions proving particularly effective. However, best practices for leveraging coding instruction data to enhance mathematical reasoning remain underexplored. This study investigates three key questions: (1) How do different coding styles of mathematical code-based rationales impact LLMs' learning performance? (2) Can general-domain coding instructions improve performance? (3) How does integrating textual rationales with code-based ones during training enhance mathematical reasoning abilities? Our findings reveal that code-based rationales with concise comments, descriptive naming, and hardcoded solutions are beneficial, while the improvements from general-domain coding instructions and textual rationales are relatively minor. Based on these insights, we propose CoinMath, a learning strategy designed to enhance mathematical reasoning by diversifying the coding styles of code-based rationales. CoinMath generates a variety of code-based rationales incorporating concise comments, descriptive naming conventions, and hardcoded solutions. Experimental results demonstrate that CoinMath significantly outperforms its baseline model, MAmmoTH, one of the state-of-the-art (SOTA) math LLMs.
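As a minimal illustration (not drawn from the paper itself), a code-based rationale exhibiting the three favored styles — concise comments, descriptive variable names, and values hardcoded directly from the problem statement — might look like the following sketch for a hypothetical word problem:

```python
# Hypothetical code-based rationale for the problem:
# "A store sells pens at $3 each. How much do 12 pens cost?"

price_per_pen = 3    # hardcoded from the problem statement
number_of_pens = 12  # hardcoded from the problem statement

# Multiply unit price by quantity to get the total cost
total_cost = price_per_pen * number_of_pens

print(total_cost)  # 36
```

The descriptive names and inline comments make each reasoning step explicit, while the hardcoded inputs tie the program directly to the specific problem instance.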