Evidence of Meaning in Language Models Trained on Programs

We present evidence that language models can learn meaning despite being trained only to perform next token prediction on text, specifically a corpus of programs. Each program is preceded by a specification in the form of (textual) input-output examples. Working with programs enables us to precisely define concepts relevant to meaning in language (e.g., correctness and semantics), making program synthesis well-suited as an intermediate testbed for characterizing the presence (or absence) of meaning in language models. We first train a Transformer model on the corpus of programs, then probe the trained model's hidden states as it completes a program given a specification. Despite providing no inductive bias toward learning the semantics of the language, we find that a linear probe is able to extract abstractions of both current and future program states from the model states. Moreover, there is a strong, statistically significant correlation between the accuracy of the probe and the model's ability to generate a program that implements the specification. To evaluate whether the semantics are represented in the model states rather than learned by the probe, we design a novel experimental procedure that intervenes on the semantics of the language while preserving the lexicon and syntax. We also demonstrate that the model learns to generate correct programs that are, on average, shorter than those in the training set, which is evidence that language model outputs may differ from the training distribution in semantically meaningful ways. In summary, this paper does not propose any new techniques for training language models, but develops an experimental framework for and provides insights into the acquisition and representation of (formal) meaning in language models.
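To make the probing step concrete, the following is a minimal sketch of a linear probe, not the authors' implementation. It assumes synthetic "hidden states" (noisy class-dependent vectors) in place of real Transformer activations and an integer label in place of an abstracted program state. The function names `make_synthetic_states`, `fit_linear_probe`, and `probe_accuracy` are hypothetical, and the least-squares fit is one simple choice of linear probe rather than the one used in the paper.

```python
import numpy as np


def make_synthetic_states(n, directions, rng, noise=0.1):
    """Stand-in for model hidden states (assumption: real activations would go here).

    Each sample gets a random label and a state equal to that label's
    direction vector plus Gaussian noise.
    """
    num_classes, dim = directions.shape
    labels = rng.integers(0, num_classes, size=n)
    states = directions[labels] + noise * rng.normal(size=(n, dim))
    return states, labels


def fit_linear_probe(states, labels, num_classes):
    """Fit a linear map (with bias) from states to one-hot labels by least squares."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])  # append bias column
    Y = np.eye(num_classes)[labels]                          # one-hot targets
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W  # shape: (dim + 1, num_classes)


def probe_accuracy(W, states, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    preds = (X @ W).argmax(axis=1)
    return float((preds == labels).mean())


rng = np.random.default_rng(0)
directions = rng.normal(size=(3, 16))  # 3 abstract states, 16-dim hidden vectors

train_states, train_labels = make_synthetic_states(400, directions, rng)
test_states, test_labels = make_synthetic_states(400, directions, rng)

W = fit_linear_probe(train_states, train_labels, num_classes=3)
print("held-out probe accuracy:", probe_accuracy(W, test_states, test_labels))
```

The probe is kept linear so that high held-out accuracy is more plausibly attributed to information already present in the states than to the probe's own capacity. That concern is the one the paper's intervention experiment is designed to address, and this sketch does not reproduce that experiment.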
