Large language models (LLMs) can perform well on a variety of reasoning tasks when given step-by-step chain-of-thought (CoT) demonstrations in the prompt. However, the reasoning chains that LLMs generate for these demonstrations are prone to errors, which can propagate into incorrect reasoning at inference time. Furthermore, inappropriate exemplars (overly simple or overly complex) can hurt performance on questions of varying difficulty. We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains. Through iterative bootstrapping, the LLM rectifies its own errors, yielding more precise and comprehensive reasoning chains. At the same time, our approach selects challenging yet answerable questions, together with their reasoning chains, as exemplars of moderate difficulty, which improves the LLM's generalizability across difficulty levels. Experimental results show that Iter-CoT achieves competitive performance across three distinct reasoning tasks on ten datasets.
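The selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `bootstrap_exemplars`, `solve`, and `max_iters`, the feedback wording, and the rule "keep a question only if it was answered correctly after at least one failed attempt" are all assumptions made here to give "challenging yet answerable" a concrete form. The `solve` callable stands in for an LLM call and is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: (question, feedback or None) -> (reasoning_chain, answer).
Solver = Callable[[str, Optional[str]], Tuple[str, str]]


def bootstrap_exemplars(
    train_set: List[Tuple[str, str]],
    solve: Solver,
    max_iters: int = 3,
) -> List[Tuple[str, str, str]]:
    """Collect (question, reasoning_chain, answer) exemplars.

    Assumed selection rule (an illustration, not taken from the paper):
    - solved on the first attempt: treated as too easy, skipped;
    - solved after one or more revisions: kept as an exemplar;
    - never solved within max_iters attempts: treated as too hard, skipped.
    """
    exemplars = []
    for question, gold in train_set:
        feedback = None
        for attempt in range(max_iters):
            chain, answer = solve(question, feedback)
            if answer == gold:
                if attempt > 0:  # correct only after revision
                    exemplars.append((question, chain, answer))
                break
            # Assumed feedback wording; a real prompt would likely differ.
            feedback = f"The answer {answer!r} is incorrect. Revise the reasoning."
    return exemplars


def toy_solve(question: str, feedback: Optional[str]) -> Tuple[str, str]:
    """Deterministic stand-in for an LLM, used only to exercise the loop."""
    if question == "easy":
        return ("2 + 2 = 4", "4")
    if question == "medium":
        # Wrong on the first attempt, correct once feedback is given.
        return ("2 * 3 = 6", "6") if feedback else ("2 * 3 = 5", "5")
    return ("no idea", "0")  # "hard": never correct


train = [("easy", "4"), ("medium", "6"), ("hard", "9")]
selected = bootstrap_exemplars(train, toy_solve)
print([q for q, _, _ in selected])
```

With the toy solver, only the question that is first answered incorrectly and then corrected is retained, which is one possible reading of "moderate difficulty".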