Machine Learning (ML) is experiencing unprecedented popularity. The operationalization of ML models is governed by a set of concepts and methods referred to as Machine Learning Operations (MLOps). Nevertheless, researchers and practitioners often focus on the automation aspect and neglect the continuous deployment and monitoring aspects of MLOps. As a result, feedback rarely flows from production back to development, so continuous learning is lacking and models deteriorate unexpectedly over time due to concept drift, particularly when data are scarce. This work explores the complete application of MLOps in the context of scarce data analysis. The paper proposes a new holistic approach to enhance biomedical image analysis. Our method includes: a fingerprinting process that enables selecting the best models, datasets, and model development strategy for the image analysis task at hand; an automated model development stage; and a continuous deployment and monitoring process to ensure continuous learning. As a preliminary result, we present a proof of concept for fingerprinting on microscopic image datasets.