Pre-trained language models are effective in a variety of natural language tasks, but it has been argued that their capabilities fall short of fully learning meaning or understanding language. To understand the extent to which language models can learn some form of meaning, we investigate their ability to capture the semantics of code beyond superficial frequency and co-occurrence. In contrast to previous research on probing models for linguistic features, we study pre-trained models in a setting that allows for objective and straightforward evaluation of a model's ability to learn semantics, because the semantics of code is precisely and formally defined. Through experiments involving the manipulation of code fragments, we show that pre-trained models of code learn a robust representation of the computational semantics of code that goes beyond superficial features of form alone.