The growing complexity of modern software systems makes understanding their behavior increasingly challenging, driving the need for explainability to improve transparency and user trust. Traditional documentation is often outdated or incomplete, making it difficult to derive accurate, context-specific explanations. Meanwhile, issue-tracking systems capture rich and continuously updated development knowledge, yet their potential for explainability remains untapped. In this work, we are the first to apply a Retrieval-Augmented Generation (RAG) approach to generate explanations from issue-tracking data. Our proof-of-concept system is implemented with open-source tools and language models, demonstrating the feasibility of leveraging structured issue data for explanation generation. Evaluated on the GitHub issues of an exemplary project, our approach achieves 90% alignment with human-written explanations. The system also exhibits strong faithfulness and instruction adherence, ensuring reliable and grounded explanations. These findings suggest that RAG-based methods can extend explainability beyond black-box ML models to a broader range of software systems, provided that issue-tracking data is available, making system behavior more accessible and interpretable.