Mathematical reasoning presents a significant challenge to the cognitive capabilities of LLMs. Various methods have been proposed to enhance the mathematical ability of LLMs, but few recognize the value of state transitions for LLM reasoning. In this work, we define mathematical problem solving as a process of transitioning from an initial unsolved state to a final resolved state, and propose the Kwai-STaR framework, which transforms LLMs into State-Transition Reasoners to improve their intuitive reasoning capabilities. Our approach comprises three main steps: (1) define a state space tailored to mathematical reasoning; (2) generate state-transition data based on that state space; (3) convert the original LLMs into State-Transition Reasoners via a curricular training strategy. Our experiments validate the effectiveness of Kwai-STaR in enhancing mathematical reasoning: after training on the small-scale Kwai-STaR dataset, general LLMs, including Mistral-7B and LLaMA-3, achieve considerable performance gains on the GSM8K and GSM-Hard datasets. Additionally, the state-transition-based design endows Kwai-STaR with remarkable training and inference efficiency. Further experiments are underway to establish the generality of Kwai-STaR.
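The state-transition view described above can be made concrete with a minimal sketch. This is an illustrative toy model, not the Kwai-STaR implementation: the names `State`, `transition`, and `is_final`, and the shape of the state (problem text, derived facts, optional answer), are assumptions introduced here to show how solving moves from an initial unsolved state to a resolved one.

```python
# Hypothetical sketch of problem solving as state transitions.
# All names and the state layout are illustrative assumptions,
# not drawn from the actual Kwai-STaR framework.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class State:
    problem: str                       # the original question (fixed across transitions)
    facts: Tuple[str, ...] = ()        # intermediate conclusions derived so far
    answer: Optional[int] = None       # set once the state is resolved

def is_final(state: State) -> bool:
    """A state is final once an answer has been committed."""
    return state.answer is not None

def transition(state: State, new_fact: Optional[str] = None,
               answer: Optional[int] = None) -> State:
    """One reasoning step: add a derived fact and/or commit an answer."""
    facts = state.facts + ((new_fact,) if new_fact else ())
    return State(state.problem, facts, answer)

# Walk a toy problem from the initial unsolved state to the resolved state.
s0 = State("Tom has 3 apples and buys 2 more. How many apples does he have?")
s1 = transition(s0, new_fact="3 + 2 = 5")   # intermediate state
s2 = transition(s1, answer=5)               # final resolved state
print(is_final(s0), is_final(s2))           # False True
```

Under this framing, a state-transition dataset is a collection of such (state, next-state) pairs, and training amounts to teaching the model to produce valid transitions until a final state is reached.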