Parameter-Efficient Fine-Tuning (PEFT) is an alternative to fully fine-tuning a language model. Although PEFT methods are widely used in the natural language domain, few studies have applied PEFT to language models pre-trained on code and comment datasets (i.e., code-LMs). Previous research has also shown that code summarization, a task that automatically generates a natural-language description of a given code snippet and is known to aid program comprehension, benefits from a multilingual fine-tuning approach. In multilingual fine-tuning, the code-LM is fine-tuned on a dataset consisting of different programming languages. AdapterFusion is a specific PEFT approach that aims to extract and compose the latent knowledge from multiple (language) adapters for a downstream task. However, our experiments reveal that AdapterFusion still learns mainly from the language of the target task and does not take advantage of other programming languages. We therefore change the architecture and propose AdvFusion, a PEFT approach that forces the model to first learn from other programming languages and only then attend to the language of the target task. AdvFusion thus emphasizes knowledge transfer among programming languages, in the spirit of multilingual fine-tuning. Our results on the CodeSearchNet dataset, using two code-LMs, show that Adapters, AdapterFusion, and our proposed AdvFusion achieve results on par with or higher than full fine-tuning for code summarization and method name prediction. Notably, the number of trainable parameters is 123x smaller and the training time is reduced by ~30%. AdvFusion also improves on AdapterFusion, with a 0.9 to 1.7-point increase in BLEU-4 scores for Ruby, JavaScript, and Go.
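To make the two-phase idea concrete, the following is a minimal sketch, not the AdvFusion implementation. It assumes that fusion is an attention-weighted mix of per-language adapter outputs, and that "first learn from other programming languages" can be approximated by excluding the target-language adapter from that mix in a first phase. The function names (`softmax`, `fuse`), the `masked_language` parameter, and the toy vectors are all hypothetical; the learned query, key, and value projections used by AdapterFusion are replaced here by a plain dot product for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(query, adapter_outputs, masked_language=None):
    """Attention-weighted mix of per-language adapter outputs.

    query: hidden-state vector (list of floats).
    adapter_outputs: dict mapping language name -> adapter output vector.
    masked_language: language excluded from the mix (assumed phase 1);
        None lets every adapter participate (assumed phase 2).
    Returns (fused_vector, weights_by_language).
    """
    languages = [lang for lang in adapter_outputs if lang != masked_language]
    if not languages:
        raise ValueError("no adapters left after masking")
    # Dot-product score between the query and each adapter output
    # (simplification: no learned projections).
    scores = [
        sum(q * v for q, v in zip(query, adapter_outputs[lang]))
        for lang in languages
    ]
    weights = softmax(scores)
    fused = [
        sum(w * adapter_outputs[lang][i] for w, lang in zip(weights, languages))
        for i in range(len(query))
    ]
    return fused, dict(zip(languages, weights))


if __name__ == "__main__":
    # Toy adapter outputs; the values are made up for illustration.
    outputs = {"ruby": [1.0, 0.0], "go": [0.0, 1.0], "python": [1.0, 1.0]}
    query = [1.0, 0.0]

    # Assumed phase 1: the target language (ruby) cannot receive attention.
    _, phase1_weights = fuse(query, outputs, masked_language="ruby")
    print("phase 1 weights:", phase1_weights)

    # Assumed phase 2: all adapters, including the target language, compete.
    _, phase2_weights = fuse(query, outputs)
    print("phase 2 weights:", phase2_weights)
```

In this sketch the only difference between the two phases is the set of adapters that may receive attention weight, which mirrors the contrast drawn above between AdapterFusion (which may concentrate on the target language) and a fusion step that is first made to rely on the other languages.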