Traditional automation technologies alone are not sufficient to enable driverless operation of trains (Grade of Automation (GoA) 4) on non-restricted infrastructure. The required perception tasks are nowadays realized using Machine Learning (ML), and the resulting ML components need to be developed and deployed reliably and efficiently. One important means to achieve this is an MLOps process, which improves reproducibility, traceability, and collaboration and supports the continuous adaptation of driverless operation to changing conditions. MLOps combines ML application development with operations (Ops) and enables high-frequency software releases and continuous innovation based on feedback from operations. In this paper, we outline a safe MLOps process for the continuous development and safety assurance of ML-based systems in the railway domain. It integrates system engineering, safety assurance, and the ML life-cycle in a comprehensive workflow. We present the individual stages of the process and their interactions. Moreover, we describe the relevant challenges in automating the different stages of the safe MLOps process.