Despite their widespread use, the mechanisms by which large language models (LLMs) represent and regulate uncertainty in next-token predictions remain largely unexplored. This study investigates two critical components believed to influence this uncertainty: the recently discovered entropy neurons and a new set of components that we term token frequency neurons. Entropy neurons are characterized by an unusually high weight norm and influence the final layer normalization (LayerNorm) scale to effectively scale down the logits. Our work shows that entropy neurons operate by writing onto an unembedding null space, allowing them to impact the residual stream norm with minimal direct effect on the logits themselves. We observe entropy neurons across a range of models, up to 7 billion parameters. Token frequency neurons, in contrast, boost or suppress each token's logit in proportion to its log frequency, thereby shifting the output distribution toward or away from the unigram distribution. Finally, we present a detailed case study in which entropy neurons actively manage confidence in the setting of induction, i.e., detecting and continuing repeated subsequences.
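The null-space mechanism described above can be illustrated with a small toy model. This is a minimal sketch, not the paper's actual setup: the unembedding matrix `W_U`, the null direction `v`, and all dimensions below are synthetic assumptions constructed so that `v` lies exactly in the null space of `W_U`. Writing along `v` then inflates the residual stream norm, which the final LayerNorm divides out, uniformly shrinking the logits and raising output entropy without changing the predicted token.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 16, 50  # toy sizes, not from any real model

# Hypothetical null direction v: made zero-mean so LayerNorm's
# centering step does not interact with the write.
v = rng.normal(size=d_model)
v -= v.mean()
v /= np.linalg.norm(v)

# Toy unembedding with v in its null space: project v out of every row,
# so W_U @ v == 0 by construction.
W_U = rng.normal(size=(vocab, d_model))
W_U -= np.outer(W_U @ v, v)

def final_layernorm(x):
    # Simplified LayerNorm (no learned gain/bias): center, then
    # divide by the standard deviation of the residual stream.
    xc = x - x.mean()
    return xc / xc.std()

def entropy(logits):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -(p * np.log(p)).sum()

x = rng.normal(size=d_model)      # residual stream before the final LayerNorm
x_written = x + 10.0 * v          # an "entropy neuron" writes along v

logits = W_U @ final_layernorm(x)
logits_written = W_U @ final_layernorm(x_written)

# The write leaves the relative logits intact but shrinks them uniformly
# (larger residual std -> smaller LayerNorm output), so the argmax is
# unchanged while the entropy of the output distribution increases.
print(logits.argmax() == logits_written.argmax())   # True
print(entropy(logits_written) > entropy(logits))    # True
```

Because `W_U @ v = 0` and `v` is zero-mean, the write changes only the normalization denominator, so the post-LayerNorm logits are an exact positive rescaling of the originals: the model's ranking over tokens is preserved while its confidence drops.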