Large language models (LLMs) have shown impressive emergent abilities in a wide range of tasks, but still face challenges in handling complex reasoning problems. Previous works like chain-of-thought (CoT) and tree-of-thoughts (ToT) have predominantly focused on enhancing accuracy, while overlooking the rapidly increasing token cost, which can be particularly problematic for open-ended real-world tasks with huge solution spaces. Motivated by the dual process theory of human cognition, we propose "Synergy of Thoughts" (SoT) to unleash the synergistic potential of hybrid LLMs for efficient reasoning. By default, SoT uses smaller-scale language models to generate multiple low-cost reasoning thoughts, which resemble the parallel intuitions produced by System 1. If these intuitions exhibit conflicts, SoT invokes the reflective reasoning of scaled-up language models to emulate the intervention of System 2, which overrides the intuitive thoughts and rectifies the reasoning process. This framework is model-agnostic and training-free, and can be flexibly implemented with various off-the-shelf LLMs. Experiments on six representative reasoning tasks show that SoT substantially reduces the token cost by 38.3%-75.1%, while simultaneously achieving state-of-the-art reasoning accuracy and solution diversity. Notably, the average token cost reduction on open-ended tasks reaches up to 69.1%. The code repository with all prompts will be released upon publication.
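The two-tier control flow described above can be illustrated with a minimal sketch. This is not the paper's implementation: the model interfaces (`small_model`, `large_model`) are hypothetical callables standing in for API clients, and unanimous voting is used as a simplified stand-in for the paper's conflict-detection criterion.

```python
from collections import Counter
from typing import Callable, List


def synergy_of_thoughts(
    question: str,
    small_model: Callable[[str], str],  # cheap "System 1" model (assumed interface)
    large_model: Callable[[str], str],  # costly "System 2" model (assumed interface)
    n_intuitions: int = 3,
) -> str:
    """Sample several low-cost intuitive answers; if they agree, return the
    consensus. Otherwise escalate to the large model for reflective reasoning."""
    intuitions: List[str] = [small_model(question) for _ in range(n_intuitions)]
    answer, votes = Counter(intuitions).most_common(1)[0]
    if votes == n_intuitions:  # no conflict: all intuitions agree
        return answer
    # Conflict detected: invoke System 2, passing the intuitions as context
    prompt = f"{question}\nCandidate thoughts: {intuitions}\nResolve and answer:"
    return large_model(prompt)
```

In this sketch, the expensive model is only called when the cheap samples disagree, which is where the token savings come from: easy queries never leave System 1.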