Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, empirical evidence on how RE is applied in practice for ML-enabled systems is dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems, and gathered 188 complete responses from 25 countries. We analyzed contemporary practices quantitatively, using bootstrapping with confidence intervals, and analyzed the reported problems qualitatively, using open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format is interactive Notebooks, (iii) non-functional requirements focus mainly on data quality, model reliability, and model explainability, and (iv) the main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to a lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results provide a better understanding of the practices adopted and of the problems that exist in practical environments. We put forward the need to further adapt and disseminate RE-related practices for engineering ML-enabled systems.