Various architecture frameworks for software, systems, and enterprises have been proposed in the literature. These frameworks identify several stakeholders and define modeling perspectives, architecture viewpoints, and views to frame and address stakeholder concerns. However, stakeholders with data science and Machine Learning (ML) related concerns, such as data scientists and data engineers, have yet to be included in existing architecture frameworks. Only by including them can we envision a holistic architecture description of an ML-enabled system. Note that the behavior and functionalities of ML components are special and should be distinguished from those of traditional software systems. The main reason is that their actual functionality is inferred from data rather than specified at design time. Additionally, the structural models of ML components, such as ML model architectures, are typically specified using notations and formalisms that differ from those the Software Engineering (SE) community uses for software structural models. Yet these two aspects, namely ML and non-ML, are becoming so intertwined that software architecture frameworks and modeling practices must be extended to support ML-enabled system architectures. In this paper, we address this gap through an empirical study using an online survey instrument. We surveyed 61 subject matter experts from over 25 organizations in 10 countries.