Software documentation supports a broad set of software maintenance tasks; however, creating and maintaining high-quality, multi-level documentation is time-consuming, and many code bases therefore lack adequate documentation. We address this problem by presenting HGEN, a fully automated pipeline that leverages LLMs to transform source code, through a series of six stages, into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. First, we use it to generate documentation for three diverse projects and engage key developers in comparing the quality of the generated documentation against their own manually crafted documentation. We then pilot HGEN in nine different industrial projects using diverse datasets provided by each project. We collect feedback from project stakeholders and analyze it using an inductive approach to identify recurring themes. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of core concepts than the baseline approach. Stakeholder feedback highlights HGEN's potential for commercial impact as a tool for accelerating code comprehension and maintenance tasks. Results and associated supplemental materials can be found at https://zenodo.org/records/11403244.