Chain-of-Thought reasoning has significantly enhanced the problem-solving capabilities of Large Language Models. However, current models generate reasoning steps sequentially and without foresight, so they often become trapped in suboptimal reasoning paths that contain redundant steps. To address this, we introduce Neural Chain-of-Thought Search (NCoTS), a framework that reformulates reasoning as a dynamic search for the optimal thinking strategy. By quantitatively characterizing the solution space, we reveal the existence of sparse superior reasoning paths that are simultaneously more accurate and more concise than standard outputs. Our method actively navigates toward these paths by evaluating candidate reasoning operators with a dual-factor heuristic that optimizes for both correctness and computational cost. Consequently, NCoTS achieves a Pareto improvement across diverse reasoning benchmarks, boosting accuracy by over 3.5% while reducing generation length by over 22%. Our code and data are available at https://github.com/MilkThink-Lab/Neural-CoT-Search.
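To make the idea of a dual-factor heuristic concrete, the following is a minimal sketch and not the NCoTS implementation. It assumes each candidate reasoning operator comes with an estimated probability of reaching a correct answer and an estimated remaining token cost, and it combines the two linearly. The operator labels, the `Candidate` fields, the `cost_weight` parameter, and the linear form of the score are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate reasoning operator (all fields are illustrative assumptions)."""
    operator: str         # hypothetical operator label, e.g. "verify"
    p_correct: float      # assumed estimate of eventual correctness, in [0, 1]
    expected_tokens: int  # assumed estimate of remaining generation length


def dual_factor_score(c: Candidate, cost_weight: float = 0.001) -> float:
    # Assumed linear trade-off: reward estimated correctness, penalize estimated cost.
    return c.p_correct - cost_weight * c.expected_tokens


def select_operator(candidates: list[Candidate], cost_weight: float = 0.001) -> Candidate:
    # Greedy choice of the highest-scoring candidate at the current step.
    if not candidates:
        raise ValueError("at least one candidate operator is required")
    return max(candidates, key=lambda c: dual_factor_score(c, cost_weight))


# Made-up numbers, used only to exercise the functions above.
candidates = [
    Candidate("continue", p_correct=0.70, expected_tokens=400),
    Candidate("verify", p_correct=0.78, expected_tokens=650),
    Candidate("conclude", p_correct=0.66, expected_tokens=120),
]
best = select_operator(candidates)
```

With these made-up numbers, the default weight favors the short "conclude" operator, while setting `cost_weight=0.0` ignores length and picks "verify", the candidate with the highest estimated correctness. How the two estimates would be obtained in practice is outside the scope of this sketch.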