Software architecture is inherently knowledge-centric. Architectural knowledge is distributed across heterogeneous software artifacts such as requirements documents, design diagrams, source code, and documentation, which makes it difficult for developers to access and use this knowledge effectively. Moreover, as systems evolve, inconsistencies frequently emerge between these artifacts, leading to architectural erosion and impeding maintenance. We envision an automated pipeline that systematically extracts architectural knowledge from diverse artifacts, links related knowledge across artifacts, identifies and resolves inconsistencies, and consolidates the result into a structured knowledge base. This knowledge base enables critical activities such as architecture conformance checking and change impact analysis, and it supports natural language question answering to improve access to architectural knowledge. To realize this vision, we plan to develop specialized extractors for different artifact types, design a unified knowledge representation schema, implement consistency checking mechanisms, and integrate retrieval-augmented generation techniques for conversational knowledge access.
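As a rough illustration of the envisioned pipeline, the following minimal Python sketch shows how per-artifact extractors might feed one shared schema, and how a simple consistency check could compare documented and implemented dependencies. All names here (`Fact`, `KnowledgeBase`, `extract_from_design`, `extract_from_code`, the `depends_on` relation, and the `A -> B` input notation) are hypothetical and chosen only for this example; they are assumptions, not the schema or mechanisms the authors plan to build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One unit of architectural knowledge in a shared (hypothetical) schema."""
    subject: str   # e.g. a component name
    relation: str  # e.g. "depends_on"
    obj: str       # e.g. another component name
    source: str    # artifact type the fact was extracted from


def extract_from_design(lines):
    """Toy extractor: assumes each design line reads 'A -> B'."""
    facts = []
    for line in lines:
        left, right = (part.strip() for part in line.split("->"))
        facts.append(Fact(left, "depends_on", right, "design"))
    return facts


def extract_from_code(imports):
    """Toy extractor: assumes a mapping of module name to imported modules."""
    return [
        Fact(module, "depends_on", target, "code")
        for module, targets in imports.items()
        for target in targets
    ]


class KnowledgeBase:
    """Consolidates facts from all extractors and checks them for consistency."""

    def __init__(self):
        self.facts = []

    def add(self, facts):
        self.facts.extend(facts)

    def _pairs(self, source, relation):
        return {
            (f.subject, f.obj)
            for f in self.facts
            if f.source == source and f.relation == relation
        }

    def inconsistencies(self, relation="depends_on"):
        """Compare what the design documents with what the code implements."""
        documented = self._pairs("design", relation)
        implemented = self._pairs("code", relation)
        return {
            "missing_in_code": sorted(documented - implemented),
            "undocumented": sorted(implemented - documented),
        }


def build_report(design_lines, code_imports):
    kb = KnowledgeBase()
    kb.add(extract_from_design(design_lines))
    kb.add(extract_from_code(code_imports))
    return kb.inconsistencies()
```

For instance, calling `build_report(["ui -> service", "service -> storage"], {"ui": ["service", "storage"], "service": []})` would flag `service -> storage` as documented but not implemented and `ui -> storage` as implemented but undocumented. A real pipeline would need far richer extractors and a retrieval layer for question answering; this sketch covers only the extract, consolidate, and check steps.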