Code comment generation aims to produce natural language descriptions of a code snippet to support developers' program comprehension. Although the task has been studied for a long time, existing approaches share a bottleneck: given a code snippet, they can generate only one comment, whereas developers usually need information from diverse perspectives, such as what the snippet does and how to use it. To tackle this limitation, this study empirically investigates the feasibility of using large language models (LLMs) to generate comments that fulfill developers' diverse intents. Our intuition rests on two facts: (1) code and its paired comments are used during LLM pre-training to build the semantic connection between natural language and programming language, and (2) comments in real-world projects, which are collected for pre-training, usually reflect different developer intents. We thus postulate that LLMs can already understand code from different perspectives after pre-training. Indeed, experiments on two large-scale datasets support this intuition: by adopting the in-context learning paradigm and giving the LLM adequate prompts (e.g., providing it with ten or more examples), the LLM can significantly outperform a state-of-the-art supervised learning approach at generating comments with multiple intents. The results also show that customized strategies for constructing prompts and post-processing strategies for reranking the results can both boost the LLM's performance, which sheds light on future research directions for LLM-based comment generation.
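The in-context learning setup described in the abstract (a prompt made of demonstrations that carry an intent label, a customized way of choosing those demonstrations, and a reranking step over the generated comments) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual method: the abstract does not specify the selection or reranking strategies, so token-overlap (Jaccard) similarity is swapped in here, and the `Example` type, the intent labels `"what"` and `"usage"`, the prompt layout, and all function names are hypothetical. The call to an LLM is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    """One demonstration: a code snippet, an intent label, and its comment."""
    code: str
    intent: str   # hypothetical labels such as "what" or "usage"
    comment: str


def tokens(text: str) -> set[str]:
    """Lowercased identifier-like tokens, used for a crude similarity measure."""
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap in [0, 1]; an assumed stand-in for a real similarity metric."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_examples(pool: list[Example], code: str, intent: str, k: int = 10) -> list[Example]:
    """Pick up to k demonstrations with the requested intent, most similar code first."""
    same_intent = [ex for ex in pool if ex.intent == intent]
    ranked = sorted(same_intent, key=lambda ex: jaccard(ex.code, code), reverse=True)
    return ranked[:k]


def build_prompt(pool: list[Example], code: str, intent: str, k: int = 10) -> str:
    """Assemble a few-shot prompt that ends where the model should write the comment."""
    demos = select_examples(pool, code, intent, k)
    parts = []
    # Least similar first, so the closest demonstration sits next to the query.
    for ex in reversed(demos):
        parts.append(f"Code:\n{ex.code}\nIntent: {ex.intent}\nComment: {ex.comment}\n")
    parts.append(f"Code:\n{code}\nIntent: {intent}\nComment:")
    return "\n".join(parts)


def rerank(candidates: list[str], reference_comment: str) -> list[str]:
    """Order generated comments by overlap with a reference comment (assumed heuristic)."""
    return sorted(candidates, key=lambda c: jaccard(c, reference_comment), reverse=True)
```

In use, `build_prompt` would be called once per desired intent for the same snippet, the resulting prompt sent to an LLM several times to obtain candidate comments, and `rerank` applied to those candidates, for instance against the comment of the most similar demonstration. The default `k=10` mirrors the "ten or more examples" mentioned in the abstract.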