A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives

Machine learning (ML) models have significantly grown in complexity and utility, driving advances across multiple domains. However, substantial computational resources and specialized expertise have historically restricted their wide adoption. Machine-Learning-as-a-Service (MLaaS) platforms have addressed these barriers by providing scalable, convenient, and affordable access to sophisticated ML models through user-friendly APIs. While this accessibility promotes widespread use of advanced ML capabilities, it also introduces vulnerabilities exploited through Model Extraction Attacks (MEAs). Recent studies have demonstrated that adversaries can systematically replicate a target model's functionality by interacting with publicly exposed interfaces, posing threats to intellectual property, privacy, and system security. In this paper, we offer a comprehensive survey of MEAs and corresponding defense strategies. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments. Our analysis covers various attack techniques, evaluates their effectiveness, and highlights challenges faced by existing defenses, particularly the critical trade-off between preserving model utility and ensuring security. We further assess MEAs within different computing paradigms and discuss their technical, ethical, legal, and societal implications, along with promising directions for future research. This systematic survey aims to serve as a valuable reference for researchers, practitioners, and policymakers engaged in AI security and privacy. Additionally, we maintain an online repository continuously updated with related literature at https://github.com/kzhao5/ModelExtractionPapers.
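To make the core threat concrete, here is a minimal sketch of the query-and-replicate loop the abstract describes: an adversary sends inputs to a prediction interface, records the returned labels, and trains a surrogate on those pairs. Everything in it is an illustrative assumption rather than a method from the survey: the victim is a toy linear classifier, the surrogate is a plain perceptron, and the names `make_victim`, `extract_surrogate`, and `fidelity` are hypothetical.

```python
import random


def make_victim(weights, bias):
    """Return a black-box predict function (hypothetical stand-in for an MLaaS API).

    The attacker only ever calls the returned function; the weights stay hidden.
    """
    def predict(x):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if score >= 0 else 0
    return predict


def extract_surrogate(query_api, dim, n_queries, epochs=50, lr=0.1, seed=0):
    """Train a perceptron surrogate using only labels returned by query_api."""
    rng = random.Random(seed)
    # Step 1: build a transfer set by querying the API with random inputs.
    queries = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_queries)]
    labels = [query_api(x) for x in queries]

    # Step 2: fit the surrogate to the (query, label) pairs.
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(queries, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            err = y - pred  # -1, 0, or +1
            if err:
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err

    def surrogate(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
    return surrogate


def fidelity(victim, surrogate, dim, n_samples=2000, seed=1):
    """Fraction of fresh random inputs on which surrogate and victim agree."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        agree += int(victim(x) == surrogate(x))
    return agree / n_samples


if __name__ == "__main__":
    # Hidden parameters chosen arbitrarily for the demonstration.
    victim = make_victim([0.8, -0.5, 0.3, 0.1, -0.9], bias=0.2)
    stolen = extract_surrogate(victim, dim=5, n_queries=500)
    print("fidelity:", fidelity(victim, stolen, dim=5))
```

Fidelity, meaning agreement with the victim on fresh inputs, is used here as the success measure because it captures functional replication without requiring access to the victim's parameters. Real attacks against deployed models face query budgets, richer model classes, and defenses that this sketch ignores.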

