With the upsurge of interest in artificial intelligence, machine learning (ML) algorithms, originally developed in academic environments, are now being deployed as parts of real-life systems that deal with large amounts of heterogeneous, dynamic, and high-dimensional data. Deploying ML methods in such systems poses challenges across the whole system life-cycle, from data management to system deployment, monitoring, and maintenance. Data-Oriented Architecture (DOA) is an emerging software engineering paradigm that has the potential to mitigate these challenges by proposing a set of principles for creating data-driven, loosely coupled, decentralised, and open systems. However, DOA as a concept is not yet widespread, and there is no common understanding of how it can be realised in practice. This review addresses that problem by contextualising the principles that underpin the DOA paradigm through the ML system challenges. We explore the extent to which current architectures of ML-based real-world systems have implemented the DOA principles. We also formulate open research challenges and directions for further development of the DOA paradigm.