We present THOUGHTSCULPT, a general reasoning and search method for tasks whose outputs can be decomposed into components. THOUGHTSCULPT explores a search tree of potential solutions using Monte Carlo Tree Search (MCTS), building solutions one action at a time and evaluating them with a domain-specific heuristic, which in practice is often simply an LLM evaluator. Critically, the action space includes revision actions: THOUGHTSCULPT may choose to revise part of its previous output rather than continue building the rest. Empirically, THOUGHTSCULPT outperforms state-of-the-art reasoning methods across three challenging tasks: Story Outline Improvement (up to +30% interestingness), Mini-Crosswords Solving (up to +16% word success rate), and Constrained Generation (up to +10% concept coverage).
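To make the idea of a search tree with both continuation and revision actions concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy task (recovering a three-letter word) and a hypothetical `evaluate` function that stands in for the LLM evaluator; `TARGET`, `ALPHABET`, `actions`, `apply_action`, `Node`, `ucb1`, and `search` are all illustrative names. The sketch scores each newly expanded node directly with the heuristic instead of running a rollout, and it does not merge duplicate states reached by different revision paths.

```python
import math
import random

TARGET = ("c", "a", "t")    # hypothetical hidden solution, used only by the toy evaluator
ALPHABET = ("a", "c", "t")  # hypothetical set of candidate components


def evaluate(state):
    """Toy stand-in for an LLM evaluator: fraction of target slots filled correctly."""
    return sum(1 for got, want in zip(state, TARGET) if got == want) / len(TARGET)


def actions(state):
    """Continue actions append a component; revise actions overwrite an existing one."""
    acts = []
    if len(state) < len(TARGET):
        acts += [("continue", None, ch) for ch in ALPHABET]
    for i, old in enumerate(state):
        acts += [("revise", i, ch) for ch in ALPHABET if ch != old]
    return acts


def apply_action(state, action):
    """Return the new partial solution produced by one action."""
    kind, index, ch = action
    if kind == "continue":
        return state + (ch,)
    return state[:index] + (ch,) + state[index + 1:]


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.untried = actions(state)  # actions not yet expanded from this node
        self.visits = 0
        self.total = 0.0               # sum of rewards backed up through this node


def ucb1(child, parent_visits, c=1.4):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    return child.total / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best = root.state
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: try one unexplored continue or revise action.
        if node.untried:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(apply_action(node.state, action), parent=node)
            node.children.append(child)
            node = child
        # Evaluation: score the state with the heuristic (no rollout in this sketch).
        reward = evaluate(node.state)
        if reward > evaluate(best):
            best = node.state
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best
```

Calling `search()` returns the highest-scoring partial or complete solution encountered during the search. Because revise actions are ordinary edges in the tree, a branch that committed to a wrong early component can still be repaired later without restarting from the root.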