Towards Secure MLOps: Surveying Attacks, Mitigation Strategies, and Research Challenges

The rapid adoption of machine learning (ML) technologies has driven organizations across diverse sectors to seek efficient and reliable methods to accelerate the path from model development to deployment. Machine Learning Operations (MLOps) has emerged as an integrative approach that addresses these requirements by unifying relevant roles and streamlining ML workflows. As the MLOps market continues to grow, securing these pipelines has become increasingly critical. However, the unified nature of the MLOps ecosystem introduces vulnerabilities, leaving it susceptible to adversarial attacks in which a single misconfiguration can lead to compromised credentials, severe financial losses, damaged public trust, and poisoned training data. Our paper applies the MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) framework, supplemented by reviews of white and grey literature, to systematically assess attacks across the different phases of the MLOps ecosystem. We begin by reviewing prior work in this domain, then outline our taxonomy and introduce a threat model that captures attackers with different levels of knowledge and capabilities. Next, we present a structured taxonomy of attack techniques explicitly mapped to the corresponding phases of the MLOps ecosystem, supported by examples drawn from red-teaming exercises and real-world incidents. This is followed by a taxonomy of mitigation strategies aligned with these attack categories, offering actionable early-stage defenses to strengthen the security of the MLOps ecosystem. Given the gradual evolution and adoption of MLOps, we further highlight key research gaps that require immediate attention. Our work emphasizes the importance of implementing robust security protocols from the outset, empowering practitioners to safeguard the MLOps ecosystem against evolving cyber attacks.
