Writing Better Software Explanations: A Guideline-Based Approach

As software systems increasingly rely on natural-language explanations to address user-reported explanation needs in requirements communication and support, ensuring that such explanations are consistent, relevant, and well formulated remains a major challenge. Purely automatic generation with large language models (LLMs) often lacks reliable grounding and controllable output quality. In this paper, we present a guideline-based formulation support tool for software explanations that combines LLM-assisted text generation with an empirically derived quality guideline. The tool structures the writing process into generation, quality checking, and iterative revision, while leaving control over domain content with developers. We evaluated the approach in a two-phase study consisting of an interview-based developer experiment and a controlled user survey. In the first phase, six industry practitioners with software development or DevOps experience formulated explanations for real explanation needs in a manual (human-only) condition and in an LLM-supported condition. In this small-scale evaluation, tool-supported formulation was on average 24.4% faster, although inferential analyses indicated only a trend toward higher efficiency. In the subsequent user study with 17 participants and 204 paired comparisons, tool-supported explanations were rated significantly higher in overall satisfaction than manual explanations (p = 0.003, rank-biserial correlation = 0.86). Our findings suggest potential efficiency gains and higher perceived formulation quality through guideline-driven LLM assistance. Future work should examine long-term industrial use and integration into existing development workflows.
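
For readers unfamiliar with the effect size reported above, the following minimal Python sketch shows one common way to compute a rank-biserial correlation for paired ratings. The abstract does not state which test or which variant of the statistic was used, so the matched-pairs form (signed-rank sums divided by the total rank sum) is an assumption here. The function name and the example ratings are hypothetical and are not data from the study.

```python
def matched_pairs_rank_biserial(tool_scores, manual_scores):
    """Matched-pairs rank-biserial correlation for paired ratings.

    Assumed variant: (sum of positive ranks - sum of negative ranks)
    divided by the total rank sum, with zero differences dropped.
    """
    if len(tool_scores) != len(manual_scores):
        raise ValueError("paired samples must have the same length")

    # Signed differences; ties between the two conditions carry no rank.
    diffs = [t - m for t, m in zip(tool_scores, manual_scores) if t != m]
    if not diffs:
        return 0.0

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (positive - negative) / (positive + negative)


# Hypothetical satisfaction ratings on a 1-5 scale (illustration only).
tool = [5, 4, 5, 3, 4]
manual = [3, 3, 2, 4, 4]
effect = matched_pairs_rank_biserial(tool, manual)
```

Under this definition, a value of 1 means every non-tied pair favors the tool-supported explanation and a value of -1 means every non-tied pair favors the manual one, so a value such as 0.86 indicates that the rank-weighted preferences lean strongly toward the tool-supported condition.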
