Transducer Tuning: Efficient Model Adaptation for Software Tasks Using Code Property Graphs

Large language models have demonstrated promising performance across various software engineering tasks. Fine-tuning is a common way to adapt these models to downstream tasks, but it becomes challenging in resource-constrained environments because the number of trainable parameters, and with it the memory requirement, grows with model size. We introduce Transducer Tuning, a technique for adapting large models to downstream code tasks using Code Property Graphs (CPGs). Our approach introduces a modular component called the Transducer, which enriches code embeddings with structural and dependency information from CPGs. The Transducer comprises two key components: a Graph Vectorization Engine (GVE) and an Attention-Based Fusion Layer (ABFL). The GVE extracts CPGs from input source code and transforms them into graph feature vectors. The ABFL then fuses those graph feature vectors with the initial code embeddings from a large language model. By optimizing these Transducers for different downstream tasks, our approach enhances the models without the need to fine-tune them for specific tasks. We evaluated Transducer Tuning on three downstream tasks: code summarization, assert generation, and code translation. Our results demonstrate performance competitive with full-parameter fine-tuning while reducing trainable parameters by up to 99% to save memory. Transducer Tuning also remains competitive against other fine-tuning approaches (e.g., LoRA, Prompt-Tuning, Prefix-Tuning) while using only 1.5%-80% of their trainable parameters. Our findings show that integrating structural and dependency information through Transducer Tuning enables more efficient model adaptation, making it easier for users to adapt large models in resource-constrained settings.
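The fusion step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`fuse`, `softmax`, `dot`), the toy vectors, and the choice of plain scaled dot-product attention with a residual connection are all assumptions made for illustration, and the learned projection weights that a real attention layer would train are omitted. The code embedding acts as the query, and each graph feature vector serves as both key and value.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(code_embedding, graph_vectors):
    """Enrich one code embedding with graph feature vectors.

    Hypothetical stand-in for an attention-based fusion step: the code
    embedding is the query, each graph vector is both key and value.
    No learned projections are used; a trainable layer would add them.
    """
    scale = math.sqrt(len(code_embedding))
    scores = [dot(code_embedding, g) / scale for g in graph_vectors]
    weights = softmax(scores)
    # Attention context: weighted sum of the graph vectors.
    context = [
        sum(w * g[i] for w, g in zip(weights, graph_vectors))
        for i in range(len(code_embedding))
    ]
    # Residual connection: keep the original embedding and add the context.
    fused = [c + x for c, x in zip(code_embedding, context)]
    return fused, weights


# Toy inputs (assumed values, not taken from the paper).
code_embedding = [1.0, 0.0, 0.0, 0.0]
graph_vectors = [
    [1.0, 0.0, 0.0, 0.0],  # aligned with the code embedding
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the code embedding
]
fused, weights = fuse(code_embedding, graph_vectors)
```

In this toy case the graph vector aligned with the code embedding receives the larger attention weight, so the fused embedding leans toward it. In a setting like the one the abstract describes, only the small fusion component would be trained while the language model that produces `code_embedding` stays untouched, which is where the parameter savings would come from.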

