Large Language Models (LLMs) exhibit various emergent abilities, some of which may reveal their internal working mechanisms. In this paper, we uncover a novel emergent capability: the intrinsic ability to perform extended sequences of calculations without relying on chain-of-thought step-by-step solutions. Remarkably, the most advanced models can directly output the results of additions of up to 15 two-digit addends. We hypothesize that models form Implicit Discrete State Representations (IDSRs) within their hidden states and perform symbolic calculations internally. To test this hypothesis, we design a series of experiments that probe the hidden states. Specifically, we first confirm that IDSRs exist. Then, we offer observations about the formation of IDSRs from the layer, digit, and sequence perspectives. Finally, we confirm that models indeed use IDSRs to produce the final answers. However, we also find that these state representations are far from lossless in current open-source models, leading to inaccuracies in their final answers. Our work presents a novel exploration of LLMs' symbolic calculation abilities and the underlying mechanisms.
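As a concrete illustration of the probing methodology described above, the following is a minimal linear-probe sketch. The model name (`gpt2` as a stand-in), the layer index, the number of addends, and the choice to decode the units digit of a running partial sum are all illustrative assumptions, not the paper's exact experimental protocol.

```python
import random

import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"   # placeholder; the paper studies larger open-source LLMs
LAYER = 6        # hypothetical intermediate layer to probe
K = 3            # probe for the running sum after the first K addends

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

random.seed(0)
feats, labels = [], []
for _ in range(500):
    addends = [random.randint(10, 99) for _ in range(5)]
    prefix = " + ".join(map(str, addends[:K]))
    prompt = " + ".join(map(str, addends)) + " ="
    # Position of the last token of the K-addend prefix within the full prompt
    # (assumes the prefix tokenizes identically as part of the full prompt).
    pos = len(tok(prefix)["input_ids"]) - 1
    with torch.no_grad():
        out = model(**tok(prompt, return_tensors="pt"))
    h = out.hidden_states[LAYER][0, pos]   # hidden state at that position
    feats.append(h.float().numpy())
    labels.append(sum(addends[:K]) % 10)   # units digit of the running sum

X_tr, X_te, y_tr, y_te = train_test_split(
    feats, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Accuracy well above the 10% chance level would indicate that a discrete
# intermediate result is linearly decodable from the hidden state.
print("probe accuracy:", probe.score(X_te, y_te))
```

In this sketch, a probe accuracy far above chance at intermediate positions would be consistent with the paper's claim that the model maintains discrete intermediate results internally rather than computing the answer only at the final token.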