Large language models (LLMs) can automate the generation of software requirements from natural language inputs such as the transcripts of elicitation interviews. However, evaluating whether those derived requirements faithfully reflect the stakeholders' needs remains a largely manual task. We introduce Text2Stories, a task and a set of metrics for text-to-story alignment that quantify the extent to which requirements (in the form of user stories) match the actual needs expressed by the elicitation session participants. Given an interview transcript and a set of user stories, our metrics quantify (i) correctness: the proportion of stories supported by the transcript, and (ii) completeness: the proportion of the transcript covered by at least one story. We segment the transcript into text chunks and instantiate the alignment as a matching problem between chunks and stories. Experiments over four datasets show that an LLM-based matcher achieves 0.86 macro-F1 on held-out annotations, while embedding models alone lag behind but enable effective blocking. Finally, we show how our metrics enable comparison across sets of stories (e.g., human-written vs. generated), positioning Text2Stories as a scalable, source-faithful complement to existing user-story quality criteria.
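To make the two metrics concrete, the sketch below (an illustration under our own assumptions, not the paper's implementation) computes correctness and completeness from a binary chunk-story match matrix; the names `match`, `correctness`, and `completeness` are hypothetical.

```python
# Illustrative sketch (assumed names, not the paper's code):
# match[i][j] is True when transcript chunk i supports user story j.
from typing import Sequence


def correctness(match: Sequence[Sequence[bool]]) -> float:
    """Proportion of stories supported by at least one transcript chunk."""
    n_stories = len(match[0])
    supported = sum(
        any(match[i][j] for i in range(len(match))) for j in range(n_stories)
    )
    return supported / n_stories


def completeness(match: Sequence[Sequence[bool]]) -> float:
    """Proportion of transcript chunks covered by at least one story."""
    covered = sum(any(row) for row in match)
    return covered / len(match)


# Toy example: 3 chunks x 2 stories.
m = [[True, False],
     [False, False],
     [True, True]]
print(correctness(m), completeness(m))  # 1.0, 0.666...
```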