Chain-of-thought (CoT) prompting is the de-facto standard technique for eliciting reasoning-like responses from large language models (LLMs), allowing them to spell out individual steps before giving a final answer. While the resemblance to human-like reasoning is undeniable, the driving forces behind the success of CoT reasoning remain largely unclear. In this work, we perform an in-depth analysis of CoT traces originating from competition-level mathematics questions, with the aim of better understanding how, and which parts of, a CoT actually contribute to the final answer. To this end, we introduce the notion of a potential, quantifying how much a given part of a CoT increases the likelihood of a correct completion. Examining reasoning traces through the lens of the potential, we identify surprising patterns, including (1) often strong non-monotonicity (due to reasoning tangents), (2) very sharp but sometimes hard-to-interpret spikes (reasoning insights and jumps), and (3) occasional lucky guesses, where the model arrives at the correct answer without providing any relevant justification beforehand. While some behaviours of the potential are readily interpretable and align with human intuition (such as insights and tangents), others remain difficult to understand from a human perspective. To further quantify the reliance of LLMs on reasoning insights, we investigate the notion of CoT transferability, measuring the potential of a weaker model under a partial CoT from another, stronger model. Aligning with our previous results, we find that as little as 20% of a partial CoT can ``unlock'' the performance of the weaker model on problems it previously could not solve, highlighting that a large part of the mechanics underpinning CoT is transferable.
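The potential of a partial CoT can be estimated empirically as the fraction of sampled completions that reach the correct answer given the question plus that CoT prefix. A minimal sketch of such an estimator follows; the `sample_completions` callable and the toy sampler are illustrative assumptions standing in for an actual LLM sampling interface, and the paper's exact estimator may differ.

```python
from typing import Callable, List


def potential(
    question: str,
    cot_prefix: str,
    correct_answer: str,
    sample_completions: Callable[[str], List[str]],
) -> float:
    """Estimate the potential of a partial CoT: the empirical probability
    that completions of (question + CoT prefix) contain the correct answer.
    `sample_completions` is a hypothetical model-sampling interface."""
    prompt = question + "\n" + cot_prefix
    completions = sample_completions(prompt)
    if not completions:
        return 0.0
    hits = sum(correct_answer in c for c in completions)
    return hits / len(completions)


# Toy sampler standing in for a real LLM call: it answers correctly
# more often when more of the reasoning trace is present in the prompt.
def toy_sampler(prompt: str) -> List[str]:
    if "step 2" in prompt:
        return ["... so the answer is 42", "... the answer is 42", "... answer is 7"]
    return ["... the answer is 7", "... the answer is 42", "... answer is 7"]


p_early = potential("Q", "step 1", "42", toy_sampler)            # 1/3
p_late = potential("Q", "step 1, step 2", "42", toy_sampler)     # 2/3
```

Tracking this quantity as the prefix grows step by step yields the potential curve whose non-monotonicity, spikes, and lucky guesses the analysis above describes.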