Engineering successful machine learning (ML)-enabled systems poses both theoretical and practical challenges. Among them are how to address unrealistic expectations of ML capabilities held by customers, managers, and even other team members, and how to connect business value to the engineering and data science activities carried out by interdisciplinary teams. In this paper, we present PerSpecML, a perspective-based approach for specifying ML-enabled systems that helps practitioners identify which attributes, spanning both ML and non-ML components, contribute to overall system quality. The approach analyzes 59 concerns related to typical tasks that practitioners face in ML projects and groups them into five perspectives: system objectives, user experience, infrastructure, model, and data. Together, these perspectives mediate communication among business owners, domain experts, designers, software and ML engineers, and data scientists. PerSpecML was developed through a series of validations conducted in different contexts: (i) in academia, (ii) with industry representatives, and (iii) in two real industrial case studies. As a result of these validations and the continuous improvements they prompted, PerSpecML is a promising approach that can positively impact the specification of ML-enabled systems, in particular by revealing key components that would otherwise have been missed.