Explainability is increasingly required in AI-enabled software systems to support transparency, user trust, and compliance. Yet explainability requirements are often written ad hoc, and unguided large language model (LLM) support can yield vague, inconsistent, or incomplete statements. This paper presents a sequential, guideline-driven workflow for formulating explainability requirements and evaluates its tool-based operationalization. We first elicited candidate quality properties through a structured literature review and developer interviews. We then prioritized these properties in an online survey with practitioners (n=20) and derived a concise guideline of ten core properties with actionable formulation instructions. Next, we operationalized the guideline in a web-based tool that supports an iterative workflow of drafting, property-based checks, and revision. We evaluated the workflow in two complementary studies. In a workshop with requirements engineers (n=6), tool support reduced formulation time by 23.5% on average (Wilcoxon p=0.021). In an independent online study with software developers (n=18), tool-supported and manually written requirements received comparable ratings for implementability and formulation quality, with a slight, purely descriptive preference for the tool-supported versions. Overall, our results suggest that combining a prioritized quality guideline with lightweight LLM support can reduce formulation effort while producing requirements that are perceived as comparable to manually written ones.