Programming languages usually come with official documentation that guides developers in using APIs, methods, and classes. However, researchers have identified insufficient or inadequate documentation examples, as well as flaws in an API's complex structure, as barriers to learning an API. As a result, developers may consult other sources (StackOverflow, GitHub, etc.) to learn more about an API. Recent studies have shown that unofficial documentation is a valuable source of information for generating code summaries. This motivated us to leverage such documentation, together with deep learning techniques, to generate high-quality summaries for APIs discussed in informal documentation. This paper proposes an automatic approach that uses BART, a state-of-the-art transformer model, to generate summaries for APIs discussed in StackOverflow. We built an oracle of human-generated summaries and evaluated our approach against it using ROUGE and BLEU, the most widely used evaluation metrics in text summarization. Furthermore, we empirically compared the quality of our summaries against a previous work. Our findings demonstrate that deep learning algorithms can improve the quality of summaries: our approach outperforms the previous work by an average of 57% for Precision, 66% for Recall, and 61% for F-measure, and it runs 4.4 times faster.
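To make the reported Precision, Recall, and F-measure concrete, the following is a minimal sketch of how a ROUGE-N style overlap score between a generated summary and a human-written reference can be computed. It is an illustration under simplifying assumptions (lowercasing and whitespace tokenization, no stemming or stop-word handling), not the evaluation code used in this work; the function name `rouge_n` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Compute n-gram overlap precision, recall, and F-measure.

    Simplified sketch: tokens are lowercased and split on whitespace.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate)
    ref = ngrams(reference)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical generated summary and human-written reference summary.
scores = rouge_n(
    "returns the length of the list",
    "returns the number of elements in the list",
)
print(scores)
```

Precision here is the share of the generated summary's n-grams that also occur in the reference, recall is the share of the reference's n-grams that the generated summary covers, and the F-measure is their harmonic mean.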