Machine Learning (ML) Operations (MLOps) frameworks are designed to help developers and AI engineers manage the lifecycle of their ML models. Although such frameworks provide a wide range of features, developers may leverage only a subset of them and still find some highly desired features missing. This paper investigates the practical use and desired feature enhancements of eight popular open-source MLOps frameworks. Specifically, we analyze how dependent projects on GitHub use these frameworks, examining how they invoke the frameworks' APIs and commands. We then qualitatively analyze feature requests and enhancements mined from the frameworks' issue trackers, relating these desired improvements to the previously identified usage features. Results indicate that MLOps frameworks are rarely used out-of-the-box and are infrequently integrated into GitHub Workflows; rather, developers use their APIs to implement custom functionality in their projects. The features used concern core ML phases and whole-infrastructure governance, and projects sometimes combine multiple frameworks with complementary features. The mapping to feature requests shows that users mainly ask for enhancements to the frameworks' core features, but also for better API exposure and CI/CD integration.