Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models

As Pre-trained Language Models (PLMs), a popular approach for code intelligence, continue to grow in size, the computational cost of using them has become prohibitive. Prompt learning, a recent development in natural language processing, has emerged as a potential solution to this challenge. In this paper, we investigate the effectiveness of prompt learning in code intelligence tasks. We show that it relies on manually designed prompts, which often require significant human effort and expertise. Moreover, we find that existing automatic prompt design methods are poorly suited to code intelligence tasks because of gradient dependence, high computational demands, and limited applicability. To address both issues, we propose Genetic Auto Prompt (GenAP), which uses an elaborate genetic algorithm to design prompts automatically. With GenAP, non-experts can effortlessly generate prompts that are superior to meticulously hand-crafted ones. GenAP requires neither gradients nor additional computational cost, making it gradient-free and cost-effective. Moreover, GenAP supports both understanding and generation types of code intelligence tasks, which gives it broad applicability. We evaluate GenAP on three popular code intelligence PLMs across three canonical code intelligence tasks: defect prediction, code summarization, and code translation. The results suggest that GenAP can effectively automate the process of designing prompts. Specifically, GenAP outperforms all other methods across all three tasks (e.g., improving accuracy by an average of 2.13% for defect prediction). To the best of our knowledge, GenAP is the first work to automatically design prompts for code intelligence PLMs.
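To make the idea of gradient-free, genetic prompt search concrete, the following is a minimal generic sketch in Python. It is not GenAP's actual algorithm, whose operators and encoding are not described in this abstract. The vocabulary `VOCAB`, the function names, and all hyperparameters are hypothetical, and `toy_fitness` is only a stand-in for what would presumably be the validation performance of a frozen PLM under a candidate prompt.

```python
import random

# Hypothetical vocabulary of prompt words (an assumption for illustration).
VOCAB = ["the", "code", "is", "defective", "buggy",
         "clean", "snippet", "this", "it", "was"]


def toy_fitness(prompt):
    """Stand-in for scoring a prompt with a frozen PLM on validation data.

    Here: the number of distinct words shared with an arbitrary target set.
    """
    target = {"code", "is", "defective"}
    return len(target & set(prompt))


def crossover(a, b, rng):
    """Single-point crossover of two equal-length prompts."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(prompt, rng, rate=0.2):
    """Replace each word with a random vocabulary word with probability `rate`."""
    return [rng.choice(VOCAB) if rng.random() < rate else tok for tok in prompt]


def genetic_prompt_search(fitness, length=4, pop_size=20, generations=30, seed=0):
    """Evolve fixed-length prompts using only fitness evaluations (no gradients)."""
    rng = random.Random(seed)
    population = [[rng.choice(VOCAB) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the top quarter unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 4]
        # Refill the population with mutated offspring of elite parents.
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = genetic_prompt_search(toy_fitness)
    print(" ".join(best), toy_fitness(best))
```

The search only ever calls `fitness`, so the model being prompted can stay frozen; in a real setting each fitness call would cost one evaluation pass, which is the main expense of this kind of search.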
