Consistent and holistic expression of software requirements is important for the success of software projects. In this study, we aim to enhance the efficiency of the software development process by automatically identifying conflicting and duplicate software requirement specifications. We formulate the conflict and duplicate detection problem as a requirement pair classification task. We design a novel transformer-based architecture, SR-BERT, which incorporates Sentence-BERT and bi-encoders for the conflict and duplicate identification task. Furthermore, we apply supervised multi-stage fine-tuning to the pre-trained transformer models. We test the performance of different transfer learning models using four different datasets. We find that sequentially trained and fine-tuned transformer models perform well across the datasets, with SR-BERT achieving the best performance on the larger datasets. We also explore the cross-domain performance of conflict detection models and adopt a rule-based filtering approach to validate the model classifications. Our analysis indicates that the sentence pair classification approach and the proposed transformer-based natural language processing strategies can contribute significantly to achieving automation in conflict and duplicate detection.
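To make the requirement pair formulation concrete, the following is a minimal sketch of how a bi-encoder might represent a pair of requirements before classification. It is not the SR-BERT implementation: `toy_encode` is a hypothetical hashed bag-of-words stand-in for a pre-trained sentence encoder, `DIM` and the example requirements are invented for illustration, and the only part taken from the Sentence-BERT setup is the pair representation that concatenates the two sentence vectors with their element-wise absolute difference, (u, v, |u - v|). A classifier over conflict and duplicate labels would then be trained on top of these features.

```python
import math
import zlib
from typing import List

DIM = 16  # toy embedding size (assumption; real sentence encoders use far more dimensions)


def toy_encode(text: str) -> List[float]:
    """Hypothetical stand-in for a sentence encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # crc32 is used because it is deterministic across runs, unlike hash()
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def pair_features(req_a: str, req_b: str) -> List[float]:
    """Encode each requirement independently, then concatenate (u, v, |u - v|)."""
    u, v = toy_encode(req_a), toy_encode(req_b)
    return u + v + [abs(a - b) for a, b in zip(u, v)]


def cosine_similarity(req_a: str, req_b: str) -> float:
    """Dot product of the two normalised embeddings."""
    u, v = toy_encode(req_a), toy_encode(req_b)
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # Invented example requirements, for illustration only
    r1 = "The system shall log out idle users after 5 minutes"
    r2 = "The system shall keep idle users logged in indefinitely"
    print(len(pair_features(r1, r2)))  # feature vector has 3 * DIM entries
    print(cosine_similarity(r1, r2))
```

Because each requirement is encoded on its own, embeddings can be computed once per requirement and reused across all pairs, which is the usual motivation for a bi-encoder over jointly encoding every pair.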