Large Language Models (LLMs) exhibit various emergent abilities, some of which may reveal their internal working mechanisms. In this paper, we uncover a novel emergent capability: the intrinsic ability to perform extended sequences of calculations without relying on chain-of-thought step-by-step solutions. Remarkably, the most advanced models can directly output the results of additions of up to 15 two-digit addends. We hypothesize that models form Implicit Discrete State Representations (IDSRs) within their hidden states and perform symbolic calculations internally. To test this hypothesis, we design a series of experiments that probe the hidden states. Specifically, we first confirm that IDSRs exist. Then, we offer observations about the formation of IDSRs from the layer, digit, and sequence perspectives. Finally, we confirm that models indeed use IDSRs to produce the final answers. However, we also find that these state representations are far from lossless in current open-source models, leading to inaccuracies in their final answers. Our work presents a novel exploration of LLMs' symbolic calculation abilities and the underlying mechanisms.
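As a concrete illustration of the probing methodology described above, the following is a minimal linear-probe sketch. The model name (`gpt2` as a stand-in), the layer index, the number of addends, and the choice to decode the units digit of a running partial sum are all illustrative assumptions, not the paper's exact experimental protocol.

```python
import random

import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # placeholder; the paper studies larger open-source LLMs
LAYER = 6        # hypothetical intermediate layer to probe
K = 3            # probe for the running sum after the first K addends

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

random.seed(0)
feats, labels = [], []
for _ in range(500):
    addends = [random.randint(10, 99) for _ in range(5)]
    prefix = " + ".join(map(str, addends[:K]))
    prompt = " + ".join(map(str, addends)) + " ="
    # Position of the last token of the K-addend prefix within the full prompt
    # (assumes the prefix tokenizes identically as part of the full prompt).
    pos = len(tok(prefix)["input_ids"]) - 1
    with torch.no_grad():
        out = model(**tok(prompt, return_tensors="pt"))
    h = out.hidden_states[LAYER][0, pos]   # hidden state at that position
    feats.append(h.float().numpy())
    labels.append(sum(addends[:K]) % 10)   # units digit of the running sum

X_tr, X_te, y_tr, y_te = train_test_split(
    feats, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Accuracy well above the 10% chance level would indicate that a discrete
# intermediate result is linearly decodable from the hidden state.
print("probe accuracy:", probe.score(X_te, y_te))
```

In this sketch, a probe accuracy far above chance at intermediate positions would be consistent with the paper's claim that the model maintains discrete intermediate results internally rather than computing the answer only at the final token.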