Utilization of Pre-trained Language Model for Adapter-based Knowledge Transfer in Software Engineering

Pre-trained Language Models (PLMs) for Software Engineering (SE), such as CodeBERT, are pre-trained on large code corpora, and the knowledge they learn has been transferred successfully to downstream tasks (e.g., code clone detection) by fine-tuning the PLMs. In Natural Language Processing (NLP), an alternative way to transfer the knowledge of PLMs has been explored: adapters, compact and parameter-efficient modules that are inserted into a PLM. Although adapters have shown promising results in many NLP downstream tasks, their application and exploration in SE downstream tasks remain limited. Here, we study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization. The adapters are trained on code corpora and inserted into a PLM that is pre-trained on either English corpora or code corpora. We call these PLMs NL-PLM and C-PLM, respectively. We observe that an NL-PLM with adapters improves on a PLM without adapters, which suggests that adapters can transfer useful knowledge from an NL-PLM to SE tasks. These results are sometimes on par with, or exceed, those of a C-PLM, while requiring fewer parameters and less training time. Interestingly, adapters inserted into a C-PLM generally yield better results than a traditionally fine-tuned C-PLM. Our results open new directions for building more compact models for SE tasks.
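
To make the notion of a compact, parameter-efficient module concrete, the following is a minimal sketch of one common adapter design: a bottleneck layer with a residual connection. The abstract does not specify the adapter architecture, so the class name `BottleneckAdapter`, the ReLU nonlinearity, the zero-initialized up-projection, and the sizes used below are all illustrative assumptions rather than the configuration used in this work.

```python
import random


def relu(vector):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, x) for x in vector]


def linear(vector, weight, bias):
    """Affine map; `weight` is a list of rows with shape (out_dim, in_dim)."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weight, bias)]


class BottleneckAdapter:
    """Hypothetical bottleneck adapter: down-project, ReLU, up-project, add residual.

    Assumption: in adapter-based transfer only these weights would be trained,
    while the surrounding PLM layers stay frozen.
    """

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size, small random init.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        self.b_down = [0.0] * bottleneck_size
        # Up-projection: bottleneck_size -> hidden_size, zero init (an assumption)
        # so that the adapter starts out as the identity function.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]
        self.b_up = [0.0] * hidden_size

    def num_parameters(self):
        """Count weights and biases of both projections."""
        hidden, bottleneck = len(self.b_up), len(self.b_down)
        return 2 * hidden * bottleneck + hidden + bottleneck

    def forward(self, hidden_state):
        """Apply the adapter to one hidden-state vector."""
        z = relu(linear(hidden_state, self.w_down, self.b_down))
        delta = linear(z, self.w_up, self.b_up)
        return [h + d for h, d in zip(hidden_state, delta)]


if __name__ == "__main__":
    # 768 matches the hidden size of base-sized models such as CodeBERT;
    # the bottleneck size of 64 is an arbitrary illustrative choice.
    adapter = BottleneckAdapter(hidden_size=768, bottleneck_size=64)
    print("adapter parameters:", adapter.num_parameters())
```

Under these assumptions, one adapter adds roughly 2 × hidden × bottleneck parameters, which is small next to a single 768 × 768 projection matrix inside a transformer layer when the bottleneck is narrow. That gap is the intuition behind the parameter-efficiency claim.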
