Machine learning and AI have recently been embraced by many companies. Machine Learning Operations (MLOps) refers to the use of continuous software engineering processes, such as DevOps, in the deployment of machine learning models to production. Nevertheless, not all machine learning initiatives successfully transition to production, owing to the many intricate factors involved. This article discusses the issues that arise in several components of the MLOps pipeline, namely the data manipulation pipeline, the model building pipeline, and the deployment pipeline. A systematic mapping study is performed to identify the challenges that arise in MLOps systems, categorized by focus area. Based on these findings, realistic and applicable recommendations are offered for tools or solutions that can be used in practical implementation. The main value of this work is that it maps the distinctive challenges in MLOps together with the recommended solutions outlined in our study. These guidelines are not specific to any particular tool and are applicable to both research and industrial settings.