Commenting code is a crucial activity in software development, as it facilitates future maintenance and updates. To make comment writing more efficient and reduce developers' workload, researchers have proposed various automated code summarization (ACS) techniques to automatically generate comments/summaries for given code units. However, these ACS techniques primarily focus on generating summaries for code units at the method level. There is a significant lack of research on summarizing higher-level code units, such as file-level and module-level code units, even though summaries of these higher-level code units are highly useful for quickly gaining a macro-level understanding of software components and architecture. To fill this gap, in this paper, we conduct a systematic study on how to use LLMs for commenting higher-level code units, including the file level and the module level. These higher-level units are significantly larger than method-level ones, which poses challenges in handling long code inputs within LLM input-length constraints while maintaining efficiency. To address these issues, we explore various summarization strategies for ACS of higher-level code units, which can be divided into three types: full code summarization, reduced code summarization, and hierarchical code summarization. The experimental results suggest that for summarizing file-level code units, using the full code is the most effective approach, with reduced code serving as a cost-efficient alternative. However, for summarizing module-level code units, hierarchical code summarization becomes the most promising strategy. In addition, inspired by research on method-level ACS, we also investigate using an LLM as an evaluator to assess the quality of summaries of higher-level code units. The experimental results demonstrate that the LLM's evaluation results strongly correlate with human evaluations.
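The three strategies named above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the helper names (`llm_summarize`, `reduce_code`) are hypothetical, and the stand-in "LLM" simply truncates text so the control flow of each strategy is visible.

```python
def llm_summarize(text: str, max_words: int = 12) -> str:
    """Stand-in for an LLM summarization call; here it just keeps the first words."""
    return " ".join(text.split()[:max_words])

def reduce_code(code: str) -> str:
    """Reduced-code idea: keep the structural skeleton (signatures), drop bodies."""
    kept = [ln for ln in code.splitlines()
            if ln.lstrip().startswith(("def ", "class "))]
    return "\n".join(kept)

def full_code_summary(file_code: str) -> str:
    # Full code summarization: feed the entire file to the model.
    return llm_summarize(file_code)

def reduced_code_summary(file_code: str) -> str:
    # Reduced code summarization: summarize only the extracted skeleton.
    return llm_summarize(reduce_code(file_code))

def hierarchical_module_summary(files: dict[str, str]) -> str:
    # Hierarchical summarization: summarize each file first,
    # then summarize the concatenated per-file summaries.
    per_file = [f"{name}: {llm_summarize(code)}" for name, code in files.items()]
    return llm_summarize("\n".join(per_file))
```

The hierarchical variant is what keeps module-level inputs within the model's context window: each intermediate call sees only one file (or one list of short summaries) at a time.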