While Chain of Thought (CoT) prompting approaches have significantly enhanced the reasoning capabilities of large language models (LLMs), they still face limitations: they either require extensive human effort or leave room for performance improvement. Existing endeavors have sought to bridge these gaps; however, they either hinge on external data and cannot completely eliminate manual effort, or they fall short in effectively steering LLMs to generate high-quality exemplary prompts. To address these pitfalls, we propose a novel prompting approach for automatic reasoning named \textbf{LBS3}, inspired by curriculum learning, which better reflects human learning habits. Specifically, LBS3 first steers LLMs to recall easy-to-hard proxy queries that are pertinent to the target query. It then invokes a progressive strategy that uses exemplary prompts derived from easy proxy queries to guide LLMs in solving hard proxy queries, ensuring high-quality proxy solutions. Finally, extensive experiments on various reasoning-intensive tasks with a range of open- and closed-source LLMs show that LBS3 achieves strongly competitive performance compared with SOTA baselines.