Programming languages usually come with official documentation that guides developers through their APIs, methods, and classes. However, researchers have identified insufficient or inadequate documentation examples, along with flaws in an API's complex structure, as barriers to learning an API. As a result, developers may consult other sources (StackOverflow, GitHub, etc.) to learn more about an API. Recent studies have shown that unofficial documentation is a valuable source of information for generating code summaries. This motivated us to combine such documentation with deep learning techniques to generate high-quality summaries for APIs discussed in informal documentation. This paper proposes an automatic approach that uses BART, a state-of-the-art transformer model, to generate summaries for APIs discussed in StackOverflow. We built an oracle of human-generated summaries and evaluated our approach against it using ROUGE and BLEU, the most widely used evaluation metrics in text summarization. Furthermore, we empirically compared the quality of our summaries against a previous work. Our findings demonstrate that deep learning algorithms can improve summary quality: our approach outperforms the previous work by an average of 57% for Precision, 66% for Recall, and 61% for F-measure, and it runs 4.4 times faster.
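To make the reported Precision, Recall, and F-measure concrete, the following is a minimal sketch of a ROUGE-1 style unigram-overlap score between a generated summary and a human-written reference. It is an illustration only, not the evaluation code used in this work: the function name `rouge_1`, the whitespace tokenization, and the example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F-measure (ROUGE-1 style).

    Hypothetical helper for illustration: it lowercases and splits on
    whitespace, with no stemming or stopword handling.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped matches: each word counts at most as often as it
    # appears in the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    # Made-up sentences, not data from the study.
    generated = "the method returns a sorted list"
    human = "returns a new sorted list"
    print(rouge_1(generated, human))
```

In this sketch, precision is the share of generated words that also appear in the reference, recall is the share of reference words recovered by the generated summary, and the F-measure is their harmonic mean.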