This paper presents our findings on the automatic summarization of Java methods at Ericsson, a global telecommunications company. We evaluate Automatic Semantic Augmentation of Prompts (ASAP), an approach that uses a Large Language Model (LLM) to generate leading summary comments for Java methods. ASAP enriches the LLM's prompt context by combining static program analysis with information retrieval to identify similar exemplar methods along with their developer-written Javadocs; it serves as the baseline in our study. Against this baseline, we explore and compare four simpler approaches that require no static program analysis, no information retrieval, and no exemplars. Our approaches take only the Java method body as input, which makes them lightweight and better suited to rapid deployment in commercial software development environments. We conducted experiments on an Ericsson software project and replicated the study on two widely used open-source Java projects, Guava and Elasticsearch, to check the reliability of our results. Performance was measured with eight metrics that capture various aspects of similarity. Notably, one of our simpler approaches performed as well as or better than ASAP on both the Ericsson project and the open-source projects. We also performed an ablation study to examine how method names affect Javadoc summary generation across our four approaches and ASAP. After masking the method names and observing the generated summaries, we found that our approaches were affected by the missing names to a statistically significantly lesser degree than the baseline. This suggests that our approaches are more robust to variations in method names and may derive summaries more comprehensively from the method body than ASAP does.
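The method-name masking used in the ablation study can be illustrated with a minimal sketch. The abstract does not describe how masking was implemented, so everything below is an assumption made for illustration: the class `MethodNameMasker`, the helper `maskMethodName`, and the placeholder token `maskedMethod` are hypothetical names, and a whole-word textual replacement stands in for whatever procedure the study actually used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MethodNameMasker {
    // Hypothetical placeholder token; the study's actual mask is not specified in the abstract.
    public static final String PLACEHOLDER = "maskedMethod";

    /**
     * Replaces every whole-word occurrence of methodName in methodSource,
     * covering both the declaration and any recursive calls.
     * Assumption: a purely textual replacement is acceptable, so an identical
     * word inside a comment or string literal would be replaced as well.
     */
    public static String maskMethodName(String methodSource, String methodName) {
        // Pattern.quote treats the name literally; \b keeps longer identifiers intact.
        Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(methodName) + "\\b");
        return wholeWord.matcher(methodSource)
                .replaceAll(Matcher.quoteReplacement(PLACEHOLDER));
    }

    public static void main(String[] args) {
        String source = "public int factorial(int n) {\n"
                + "    return n <= 1 ? 1 : n * factorial(n - 1);\n"
                + "}";
        // The masked text is what would be passed to the LLM in place of the original method.
        System.out.println(maskMethodName(source, "factorial"));
    }
}
```

Under these assumptions, the masked source keeps the method body intact while removing the hint carried by the name, so any difference between the summaries generated with and without masking can be attributed to the method name. A parser-based rename would be more precise than this regular-expression version, at the cost of an extra dependency.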