Traceability, the ability to trace relevant software artifacts to support reasoning about the quality of the software and its development process, plays a crucial role in requirements and software engineering, particularly for safety-critical systems. In this chapter, we provide a comprehensive overview of the representative tasks in requirements traceability for which natural language processing (NLP) and related techniques have made considerable progress in the past decade. We first define traceability in the context of requirements and the overall engineering process, and introduce other important concepts related to traceability tasks. We then discuss two tasks in detail: trace link recovery and trace link maintenance. We also introduce two related tasks that arise when trace links are used in practice. For each task, we explain its characteristics, how it can be approached with NLP techniques, and how to design and conduct experiments that demonstrate the performance of those techniques. We further discuss practical considerations for applying NLP techniques effectively and assessing their effectiveness, covering data set collection, metric selection, and the role of humans in evaluating NLP approaches. Overall, this chapter equips readers with the fundamental knowledge needed to design automated traceability solutions enabled by NLP in practice.