Language models of code have demonstrated state-of-the-art performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and the resulting environmental footprint remain significant challenges. This work introduces ALPINE, an adaptive, programming-language-agnostic pruning technique designed to substantially reduce these models' computational overhead. The proposed method offers a pluggable layer that can be integrated with all Transformer-based models. With ALPINE, input sequences undergo adaptive compression throughout the pipeline, shrinking to as little as one third of their initial size and significantly reducing the computational load. Our experiments on two software engineering tasks, defect prediction and code clone detection, across three language models, CodeBERT, GraphCodeBERT, and UniXCoder, show that ALPINE achieves on average up to a 50% reduction in FLOPs, a 58.1% decrease in memory footprint, and a 28.1% improvement in throughput, which translates into a reduction in CO$_2$ emissions of up to $44.85$%. Importantly, it achieves this reduction in computational resources while maintaining up to 98.1% of the original predictive performance. These findings highlight the potential of ALPINE to make language models of code more resource-efficient and accessible while preserving their performance, contributing to the overall sustainability of adopting language models in software development. The substantial sequence compression achieved by ALPINE also sheds light on the redundant and noisy information present in source code analysis corpora.
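To illustrate the general idea behind a pluggable token-pruning layer, the minimal sketch below drops low-importance tokens between Transformer layers so later layers operate on a roughly 3x shorter sequence. This is a generic, hypothetical illustration of adaptive token pruning, not ALPINE's actual algorithm; the importance scores here are random stand-ins for whatever signal (e.g., attention mass) a real method would use.

```python
import numpy as np

def prune_tokens(hidden, scores, keep_ratio=1/3):
    """Keep the top-scoring tokens, preserving their original order.

    hidden: (seq_len, d_model) hidden states entering a Transformer layer.
    scores: (seq_len,) per-token importance scores (hypothetical here).
    Returns the pruned hidden states and the indices that were kept.
    """
    seq_len = hidden.shape[0]
    k = max(1, int(np.ceil(seq_len * keep_ratio)))
    keep = np.sort(np.argsort(scores)[-k:])  # top-k tokens, original order
    return hidden[keep], keep

rng = np.random.default_rng(0)
hidden = rng.normal(size=(512, 768))  # e.g., CodeBERT-sized hidden states
scores = rng.random(512)              # stand-in importance scores
pruned, kept = prune_tokens(hidden, scores)
print(pruned.shape)  # (171, 768): sequence is ~3x shorter
# Self-attention cost grows quadratically with sequence length, so a
# 3x shorter sequence cuts attention FLOPs roughly 9x in later layers.
```

Because the layer only selects and reorders rows of the hidden-state matrix, it can in principle be inserted between any two layers of an encoder without changing the model's weights.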