With the growing prevalence of Machine Learning (ML) enabled software applications, the paradigm of ML Operations (MLOps) has attracted considerable attention from researchers and practitioners. MLOps encompasses the practices and technologies for streamlining the resource and monitoring needs of operationalizing ML models. Software development practitioners need access to detailed and easily understandable knowledge of MLOps workflows, practices, challenges, and solutions to support the adoption of MLOps effectively and efficiently. Whilst the academic and industry literature on MLOps has been growing rapidly, there have been relatively few attempts to systematically synthesize and analyze the vast body of existing MLOps literature to improve ease of access and understanding. We conducted a Multivocal Literature Review (MLR) of 150 relevant academic studies and 48 gray literature sources to provide a comprehensive body of knowledge on MLOps. Through this MLR, we identified emerging MLOps practices, adoption challenges, and solutions related to various areas, including the development and operation of complex pipelines, managing production at scale, managing artifacts, and ensuring quality, security, governance, and ethical aspects. We also report on the socio-technical aspects of MLOps, relating to the diverse roles involved and the collaboration practices across them throughout the MLOps lifecycle. We assert that this MLR provides valuable insights to researchers and practitioners seeking to navigate the rapidly evolving landscape of MLOps. We also identify the open issues that need to be addressed to advance the state of the art of MLOps.