Despite recent advances in large language models, open-source models often struggle to consistently perform well on complex reasoning tasks. Existing ensemble methods, whether applied at the token or output levels, fail to address these challenges. In response, we present Language model Ensemble with Monte Carlo Tree Search (LE-MCTS), a novel framework for process-level ensembling of language models. LE-MCTS formulates step-by-step reasoning with an ensemble of language models as a Markov decision process. In this framework, states represent intermediate reasoning paths, while actions consist of generating the next reasoning step using one of the language models selected from a predefined pool. Guided by a process-based reward model, LE-MCTS performs a tree search over the reasoning steps generated by different language models, identifying the most accurate reasoning chain. Experimental results on five mathematical reasoning benchmarks demonstrate that our approach outperforms both single language model decoding algorithms and language model ensemble methods. Notably, LE-MCTS improves performance by 3.6% and 4.3% on the MATH and MQA datasets, respectively, highlighting its effectiveness in solving complex reasoning problems.
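The MDP formulation above can be illustrated with a minimal sketch: states are tuples of reasoning steps, each action appends the next step generated by one model from a pool, and a process-based reward model scores completed chains. Everything here is a stand-in assumption, not the paper's implementation: `models` are plain callables rather than language models, `reward_model` is a toy scorer rather than a trained process reward model, and the rollout/UCB details are generic MCTS choices.

```python
import math
import random

def mcts_ensemble(models, reward_model, max_steps=3, iters=300, c=1.4):
    """Toy Monte Carlo tree search over reasoning steps from a model pool.

    models: list of callables state -> next step (stand-ins for LMs).
    reward_model: callable scoring a complete chain in [0, 1]
                  (stand-in for a process-based reward model).
    Returns the most-visited chain of steps.
    """
    root = {"state": (), "children": {}, "visits": 0, "value": 0.0}

    def ucb(parent, child):
        # Standard UCT score: exploit mean reward, explore rarely-tried actions.
        if child["visits"] == 0:
            return float("inf")
        return (child["value"] / child["visits"]
                + c * math.sqrt(math.log(parent["visits"]) / child["visits"]))

    for _ in range(iters):
        # Selection: descend while the node is non-terminal and fully expanded.
        node, path = root, [root]
        while (len(node["state"]) < max_steps
               and len(node["children"]) == len(models)):
            node = max(node["children"].values(), key=lambda ch: ucb(node, ch))
            path.append(node)
        # Expansion: the action space is "which model generates the next step".
        if len(node["state"]) < max_steps:
            untried = [i for i in range(len(models)) if i not in node["children"]]
            i = random.choice(untried)
            step = models[i](node["state"])
            child = {"state": node["state"] + (step,), "children": {},
                     "visits": 0, "value": 0.0}
            node["children"][i] = child
            node = child
            path.append(node)
        # Rollout: finish the chain with randomly chosen models, then score it.
        state = node["state"]
        while len(state) < max_steps:
            state = state + (random.choice(models)(state),)
        reward = reward_model(state)
        # Backpropagation along the selected path.
        for n in path:
            n["visits"] += 1
            n["value"] += reward
    # Extract the most-visited chain as the final answer path.
    chain, node = [], root
    while node["children"]:
        node = max(node["children"].values(), key=lambda ch: ch["visits"])
        chain.append(node["state"][-1])
    return tuple(chain)
```

With stub generators and a reward that favors one model's steps, the search quickly concentrates visits on the highest-reward chain, mirroring how LE-MCTS mixes steps from different models and keeps the chain the reward model prefers.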