This work stems from prior complementary observations on the dynamics of Chain-of-Thought (CoT): Large Language Models (LLMs) have been shown to perform latent planning of subsequent reasoning before the CoT emerges, diminishing the significance of explicit CoT; yet CoT remains critical for tasks requiring multi-step reasoning. To deepen the understanding of the relationship between an LLM's internal states and its verbalized reasoning trajectories, we investigate the latent planning strength of LLMs with our probing method, Tele-Lens, applied to hidden states across diverse task domains. Our empirical results indicate that LLMs exhibit a myopic planning horizon, primarily conducting incremental transitions rather than precise global planning. Leveraging this characteristic, we hypothesize that uncertainty estimation over CoT can be enhanced, and we validate that a small subset of CoT positions can effectively represent the uncertainty of the entire reasoning path. We further underscore the significance of exploiting CoT dynamics, and demonstrate that automatic recognition of CoT bypass can be achieved without performance degradation. Our code, data, and models are released at https://github.com/lxucs/tele-lens.