Replication and Verifiability in Requirements Engineering: the NLP for RE Case

[Context] Study replication is essential for theory building and empirical validation. [Problem] Despite its empirical vocation, requirements engineering (RE) research has given limited attention to study replication, thereby threatening the ability to verify existing results and to use previous research as a baseline. [Solution] In this perspective paper, we, a group of experts in natural language processing (NLP) for RE, reflect on the challenges for study replication in NLP for RE. Concretely: (i) we report on hands-on experiences of replication, (ii) we review the state of the art and extract replication-relevant information, and (iii) we identify, through focus groups, challenges across two typical dimensions of replication: data annotation and tool reconstruction. NLP for RE is a research area well suited to study replication, since it builds on automated tools that can be shared and on quantitative evaluations that enable direct comparisons between results. [Results] Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity of the tasks involving NLP, the tasks' inherent hairiness, and, in turn, the heterogeneous reporting structure. To address these issues, we propose an ID card whose goal is to provide a structured summary of research papers, with an emphasis on replication-relevant information. [Contribution] This study contributes: (i) a set of reflections on replication in NLP for RE, (ii) a set of recommendations for researchers in the field to increase their awareness of the topic, and (iii) an ID card that is intended primarily to foster replication and can also be used in other contexts, e.g., for educational purposes. Practitioners will also benefit from the results, since replications increase confidence in research findings.
