A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks

Pre-trained models (PTMs) have achieved great success in various Software Engineering (SE) downstream tasks following the "pre-train then fine-tune" paradigm. Because fully fine-tuning all parameters of a PTM can be computationally expensive, a widely used alternative is parameter-efficient fine-tuning (PEFT), which freezes the PTM and trains only a small number of newly introduced parameters. Although some work has tested PEFT methods in the SE field, a comprehensive evaluation is still lacking. This paper aims to fill this gap by evaluating the effectiveness of five PEFT methods on eight PTMs and four SE downstream tasks. For different tasks and PEFT methods, we seek answers to the following research questions: 1) Is it more effective to use PTMs trained specifically on source code, or is it sufficient to use PTMs trained on natural language text? 2) What is the impact of varying model sizes? 3) How does the model architecture affect performance? Besides effectiveness, we also discuss the efficiency of PEFT methods in terms of training time and GPU resource consumption. We hope that our findings provide a deeper understanding of PEFT methods across various PTMs and SE downstream tasks. All code and data are available at https://github.com/zwtnju/PEFT.git.
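
To make the "freeze the PTM, train only extra parameters" idea concrete, here is a minimal sketch in plain Python. It uses a LoRA-style low-rank update purely as an illustration; the abstract does not name the five PEFT methods evaluated, so this should not be read as the authors' method. The hidden size `d`, the rank `r`, and all function names are hypothetical choices for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def count_params(m):
    """Number of scalar entries in a matrix."""
    return sum(len(row) for row in m)

random.seed(0)
d, r = 64, 4  # illustrative hidden size and low-rank dimension (assumptions)

# Frozen pre-trained weight: never updated during fine-tuning.
W = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(d)]

# Extra trainable parameters: A is (r x d), B is (d x r).
# B starts at zero, so the adapted model initially equals the frozen one.
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]

def effective_weight(W, A, B):
    """Return W + B @ A, the weight the adapted layer would use."""
    delta = matmul(B, A)  # (d x d) low-rank update
    return [[w + dw for w, dw in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

frozen = count_params(W)
trainable = count_params(A) + count_params(B)
trainable_fraction = trainable / (frozen + trainable)
print(f"frozen={frozen}, trainable={trainable}, fraction={trainable_fraction:.3f}")
```

In this toy setting only `A` and `B` would receive gradient updates, which is why the trainable parameter count stays small relative to the frozen weight; at realistic model sizes the ratio is far smaller than in this small example.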

