Towards MLOps: A DevOps Tools Recommender System for Machine Learning System

Applying DevOps practices to machine learning systems is termed MLOps. Unlike traditional systems, which evolve with changing requirements, machine learning systems evolve with new data. The objective of MLOps is to connect different open-source tools into a pipeline that automatically constructs a dataset, trains the machine learning model, deploys the model to production, and stores different versions of the model and dataset. The benefit of MLOps is the fast delivery of newly trained models to production so that results remain accurate. Furthermore, MLOps practice affects the overall quality of software products and depends entirely on open-source tools. Selecting relevant open-source tools is considered a challenge, so a generalized method for selecting appropriate tools is desirable. In this paper, we present a framework for a recommendation system that processes the contextual information of a machine learning project (e.g., nature of the data, type of the data) and recommends a relevant toolchain (tech stack) for the operationalization of machine learning systems. To check the applicability of the proposed framework, four approaches were investigated: rule-based, random forest, decision trees, and k-nearest neighbors. Precision, recall, and F-score were measured, and random forest outperformed the other approaches with the highest F-score of 0.66.
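To make the rule-based approach and the evaluation metrics concrete, the following is a minimal sketch and not the paper's implementation. The context keys (`data_type`, `data_nature`), the rules, the tool assignments, and the "relevant" toolchain are all hypothetical values chosen for illustration. Only the precision, recall, and F-score definitions are standard.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical rules: each maps a set of context conditions to one tool.
# The mappings are illustrative assumptions, not taken from the paper.
Rule = Tuple[Dict[str, str], str]
RULES: List[Rule] = [
    ({"data_type": "image"}, "DVC"),
    ({"data_nature": "streaming"}, "Apache Kafka"),
    ({"data_type": "tabular"}, "Great Expectations"),
    ({}, "MLflow"),  # empty condition: matches every project
]


def recommend(context: Dict[str, str], rules: List[Rule]) -> Set[str]:
    """Return every tool whose conditions all hold in the project context."""
    return {
        tool
        for conditions, tool in rules
        if all(context.get(key) == value for key, value in conditions.items())
    }


def precision_recall_f1(recommended: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Compute set-based precision, recall, and F1 for one recommendation."""
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical project context and ground-truth toolchain.
context = {"data_type": "image", "data_nature": "streaming"}
relevant = {"DVC", "MLflow", "Docker"}

recommended = recommend(context, RULES)
precision, recall, f1 = precision_recall_f1(recommended, relevant)
print(sorted(recommended), round(precision, 2), round(recall, 2), round(f1, 2))
```

In this sketch a recommendation is scored as a set of tools against a known-good set for the same project. The learned approaches named in the abstract (random forest, decision trees, k-nearest neighbors) would replace `recommend` with a trained classifier while keeping the same scoring.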
