Code comment generation aims to produce natural language descriptions of a code snippet to support developers' program comprehension. Although the task has long been studied, existing approaches share a bottleneck: given a code snippet, they can generate only one comment, whereas developers usually need information from diverse perspectives, such as what the snippet does and how to use it. To tackle this limitation, this study empirically investigates the feasibility of using large language models (LLMs) to generate comments that fulfill developers' diverse intents. Our intuition rests on two facts: (1) code and its paired comments are used during LLM pre-training to build the semantic connection between natural language and programming language, and (2) comments in real-world projects, which are collected for pre-training, usually reflect different developer intents. We thus postulate that LLMs can already understand code from different perspectives after pre-training. Indeed, experiments on two large-scale datasets support this rationale: by adopting the in-context learning paradigm and giving the LLM adequate prompts (e.g., providing it with ten or more examples), the LLM can significantly outperform a state-of-the-art supervised learning approach at generating comments with multiple intents. The results also show that customized strategies for constructing prompts and post-processing strategies for reranking the results can both boost the LLM's performance, which sheds light on future research directions for using LLMs for comment generation.
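The in-context learning setup summarized above (intent-specific demonstrations in the prompt, followed by reranking of candidate comments) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the `Demo` record, the intent labels `"what"` and `"usage"`, the prompt layout, and the use of Jaccard token overlap both for selecting demonstrations and for reranking. The call to an actual LLM is left out; `build_prompt` only produces the text that would be sent to one.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Demo:
    """One demonstration: a code snippet, an intent label, and a comment (hypothetical schema)."""
    code: str
    intent: str
    comment: str


def tokens(text: str) -> Set[str]:
    """Crude lexical tokens: split on non-word characters and lowercase."""
    return {t.lower() for t in re.split(r"\W+", text) if t}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_demos(pool: List[Demo], code: str, intent: str, k: int = 10) -> List[Demo]:
    """Keep demonstrations with the requested intent, most similar code first."""
    target = tokens(code)
    same_intent = [d for d in pool if d.intent == intent]
    ranked = sorted(same_intent, key=lambda d: jaccard(tokens(d.code), target), reverse=True)
    return ranked[:k]


def build_prompt(demos: List[Demo], code: str, intent: str) -> str:
    """Concatenate the demonstrations, then the query with an open 'Comment:' slot."""
    parts = [f"Code:\n{d.code}\nIntent: {d.intent}\nComment: {d.comment}\n" for d in demos]
    parts.append(f"Code:\n{code}\nIntent: {intent}\nComment:")
    return "\n".join(parts)


def rerank(candidates: List[str], code: str) -> List[str]:
    """Order candidate comments by token overlap with the code (an assumed heuristic)."""
    target = tokens(code)
    return sorted(candidates, key=lambda c: jaccard(tokens(c), target), reverse=True)


# Tiny illustrative pool; a realistic setup would draw ten or more demonstrations.
pool = [
    Demo("def add(a, b): return a + b", "what", "Adds two numbers."),
    Demo("def add(a, b): return a + b", "usage", "Call add(1, 2) to get 3."),
    Demo("def read_file(path): return open(path).read()", "what", "Reads a file into a string."),
]
query = "def add_three(a, b, c): return a + b + c"
demos = select_demos(pool, query, "what", k=1)
prompt = build_prompt(demos, query, "what")
best_first = rerank(["Opens a socket.", "Returns the sum of a, b and c."], query)
```

Changing the `intent` argument changes both which demonstrations are retrieved and which label the query carries, which is how a single model could be steered toward different kinds of comments for the same snippet. Token overlap is only one possible similarity measure; an embedding-based one could be substituted in `select_demos` and `rerank` without altering the rest of the sketch.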