Pre-trained Language Models (PLMs) for Software Engineering (SE), such as CodeBERT, are pre-trained on large code corpora, and the knowledge they learn has been transferred successfully to downstream tasks (e.g., code clone detection) through fine-tuning. In Natural Language Processing (NLP), an alternative way to transfer the knowledge of a PLM is the adapter, a compact and parameter-efficient module that is inserted into the PLM. Although adapters have shown promising results in many NLP downstream tasks, their application to SE downstream tasks has received limited attention. Here, we study knowledge transfer using adapters on multiple downstream tasks: cloze test, code clone detection, and code summarization. The adapters are trained on code corpora and inserted into a PLM that is pre-trained on either English corpora or code corpora. We call these PLMs NL-PLM and C-PLM, respectively. We observe that an NL-PLM with adapters improves over a PLM without adapters, which suggests that adapters can transfer useful knowledge from an NL-PLM and apply it to SE tasks. The results are sometimes on par with, or exceed, those of a C-PLM, while being more efficient in the number of parameters and in training time. Interestingly, adapters inserted into a C-PLM generally yield better results than a traditionally fine-tuned C-PLM. Our results open new directions for building more compact models for SE tasks.
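To make the notion of a compact, parameter-efficient module concrete, the following is a minimal sketch of one possible adapter. The abstract does not specify the adapter architecture, so this assumes the widely used bottleneck design (down-projection, non-linearity, up-projection, residual connection). The function names, the hidden size of 768, and the bottleneck size of 64 are illustrative assumptions, not values taken from this work.

```python
import random


def make_adapter(hidden_size, bottleneck_size, seed=0):
    """Create the weights of a hypothetical bottleneck adapter."""
    rng = random.Random(seed)
    return {
        # Down-projection: hidden_size -> bottleneck_size, small random init.
        "down_w": [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                   for _ in range(bottleneck_size)],
        "down_b": [0.0] * bottleneck_size,
        # Up-projection: bottleneck_size -> hidden_size, zero init so the
        # adapter starts out as an identity function (an assumed design choice).
        "up_w": [[0.0] * bottleneck_size for _ in range(hidden_size)],
        "up_b": [0.0] * hidden_size,
    }


def linear(weights, bias, x):
    """Compute weights @ x + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def adapter_forward(adapter, hidden_state):
    """Apply the adapter to one hidden-state vector."""
    z = linear(adapter["down_w"], adapter["down_b"], hidden_state)
    z = [max(0.0, v) for v in z]  # ReLU non-linearity
    delta = linear(adapter["up_w"], adapter["up_b"], z)
    # Residual connection: the frozen PLM's representation passes through,
    # and the adapter only learns a correction on top of it.
    return [h + d for h, d in zip(hidden_state, delta)]


def adapter_param_count(hidden_size, bottleneck_size):
    """Trainable parameters in one adapter: two weight matrices, two biases."""
    return 2 * hidden_size * bottleneck_size + hidden_size + bottleneck_size


if __name__ == "__main__":
    # Illustrative sizes only; the real values depend on the PLM and setup.
    print(adapter_param_count(768, 64))
```

Under these assumed sizes, a single adapter has about 0.1 million trainable parameters, which illustrates why training only the adapters while the PLM's own weights stay frozen can be much cheaper than fine-tuning every weight of the model.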