In this paper, we investigate how output token probability information is encoded in the output embeddings of language models. We identify an approximate, common log-linear encoding of output token probabilities within the output embedding vectors and empirically demonstrate that this encoding is both accurate and sparse. As a causal test, we steer this encoding in the output embeddings and show that the output probability distribution can be modified precisely. Moreover, the sparsity of the probability encoding suggests that a large number of output embedding dimensions contribute little to causal language modeling. We therefore prune these output-unrelated dimensions and find that more than 30% of the dimensions can be deleted without significantly shifting the output distribution or degrading sequence generation. Additionally, examining the pre-training dynamics of language models, we find that the output embeddings capture corpus token frequency information in early steps, even before the parameters begin to converge noticeably.
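To make the log-linear claim concrete, the following is a minimal sketch of how such an encoding can be probed; it is not the paper's exact procedure. The choice of GPT-2 via Hugging Face transformers, the placeholder corpus file sample.txt, and the ridge-regression probe are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's procedure): probe whether
# token log-probabilities are linearly decodable from output embeddings.
# Assumptions: GPT-2 as the model, corpus frequencies estimated from a
# placeholder file `sample.txt`, ridge regression as the linear probe.
import numpy as np
from collections import Counter
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# Output (unembedding) matrix: one row per vocabulary token.
W = model.lm_head.weight.detach().numpy()  # shape (vocab_size, d_model)

# Estimate empirical token log-probabilities from a stand-in corpus.
ids = tokenizer(open("sample.txt").read())["input_ids"]
counts = Counter(ids)
tokens = np.array(sorted(counts))  # tokens observed at least once
log_p = np.log(np.array([counts[t] for t in tokens]) / len(ids))

# Fit one linear map from embedding vectors to log-probabilities;
# a high held-out R^2 is consistent with a common log-linear encoding.
X_tr, X_te, y_tr, y_te = train_test_split(W[tokens], log_p, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("held-out R^2:", probe.score(X_te, y_te))
```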
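The dimension-deletion experiment can be sketched in a similar spirit. Here the column-norm importance score, the GPT-2 model, the 30% deletion ratio applied directly, and the single-prompt KL check are placeholder assumptions rather than the paper's method.

```python
# Sketch (placeholder assumptions, not the paper's method): zero out ~30%
# of output-embedding dimensions and measure how the next-token
# distribution moves, using KL divergence on a single prompt.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tok = GPT2TokenizerFast.from_pretrained("gpt2")
ids = tok("The quick brown fox", return_tensors="pt")["input_ids"]

# Untie lm_head from the input embedding (GPT-2 ties them by default),
# so the deletion affects only the output side.
model.lm_head.weight = torch.nn.Parameter(model.lm_head.weight.clone())

with torch.no_grad():
    ref = torch.log_softmax(model(ids).logits[0, -1], dim=-1)

    # Rank dimensions by the L2 norm of their unembedding column; this
    # importance score is an illustrative choice.
    W = model.lm_head.weight                # (vocab_size, d_model)
    drop = W.norm(dim=0).argsort()[: int(0.3 * W.shape[1])]
    W[:, drop] = 0.0                        # delete the 30% smallest columns

    pruned = torch.log_softmax(model(ids).logits[0, -1], dim=-1)

# A small KL suggests the output distribution barely moved despite deletion.
kl = torch.sum(pruned.exp() * (pruned - ref))
print(f"KL(pruned || reference) = {kl.item():.4f}")
```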