Large language models (LLMs) have demonstrated remarkable reasoning capabilities through chain-of-thought (CoT) prompting, which generates intermediate reasoning chains that serve as the rationale for deriving the answer. However, current CoT methods either rely on general prompts such as "Let's think step by step" or depend heavily on handcrafted task-specific demonstrations to achieve strong performance, leaving a gap between performance and generalization. To bridge this gap, we propose Meta-CoT, a generalizable CoT prompting method for mixed-task scenarios where the type of input question is unknown. Meta-CoT first categorizes the scenario based on the input question and then automatically constructs diverse demonstrations from the corresponding data pool. Meta-CoT achieves both strong performance on ten public benchmark reasoning tasks and superior generalization capabilities. Notably, Meta-CoT achieves the state-of-the-art result on SVAMP (93.7%) without any additional program-aided methods. Further experiments on five out-of-distribution datasets verify the stability and generality of Meta-CoT.
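The two-stage flow described in the abstract (categorize the scenario, then build diverse demonstrations from the matching data pool) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the keyword-based `identify_scenario`, the greedy word-overlap selection in `select_diverse`, the `POOLS` contents, and all function names are hypothetical stand-ins for whatever Meta-CoT actually uses.

```python
from typing import Dict, List, Tuple

Demo = Tuple[str, str]  # (question, rationale that ends with the answer)

# Hypothetical data pools keyed by scenario; the contents are made up.
POOLS: Dict[str, List[Demo]] = {
    "arithmetic": [
        ("Tom has 3 apples and buys 2 more. How many apples does he have?",
         "He starts with 3 and adds 2, so 3 + 2 = 5. The answer is 5."),
        ("Tom has 4 apples and buys 2 more. How many apples does he have?",
         "He starts with 4 and adds 2, so 4 + 2 = 6. The answer is 6."),
        ("A train travels 60 miles per hour for 2 hours. How far does it go?",
         "Distance is speed times time, so 60 * 2 = 120. The answer is 120."),
    ],
    "commonsense": [
        ("Where would you put a plate after washing it?",
         "Clean plates are stored in a cupboard. The answer is cupboard."),
    ],
}


def identify_scenario(question: str) -> str:
    """Toy stand-in for scenario identification (assumption, not the paper's method)."""
    q = question.lower()
    if any(ch.isdigit() for ch in q) or "how many" in q:
        return "arithmetic"
    return "commonsense"


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def select_diverse(pool: List[Demo], k: int) -> List[Demo]:
    """Greedy selection: repeatedly add the demo least similar to those already chosen."""
    if not pool or k <= 0:
        return []
    chosen = [pool[0]]
    remaining = list(pool[1:])
    while remaining and len(chosen) < k:
        best = min(
            remaining,
            key=lambda d: max(jaccard(d[0], c[0]) for c in chosen),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


def build_prompt(question: str, k: int = 2) -> str:
    """Route the question to a scenario, pick demos, and assemble a CoT prompt."""
    scenario = identify_scenario(question)
    demos = select_diverse(POOLS[scenario], k)
    parts = [f"Q: {q}\nA: Let's think step by step. {r}" for q, r in demos]
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Sam has 7 pens and loses 3. How many pens are left?"))
```

In this sketch the near-duplicate apple question is skipped in favor of the train question, which is the kind of diversity the demonstration-construction step is meant to encourage; a real system would replace both the router and the selection heuristic with stronger components.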