The advent of Machine Learning as a Service (MLaaS) has heightened the trade-off between model explainability and security. In particular, explainability techniques such as counterfactual explanations inadvertently increase the risk of model extraction attacks, enabling unauthorized replication of proprietary models. In this paper, we formalize and characterize the risks and inherent complexity of model reconstruction, focusing on the "oracle" queries required to faithfully infer the underlying prediction function. We present the first formal analysis of model extraction attacks through the lens of competitive analysis, establishing a foundational framework for evaluating their efficiency. Focusing on models based on additive decision trees (e.g., decision trees, gradient boosting, and random forests), we introduce novel reconstruction algorithms that achieve provably perfect fidelity while demonstrating strong anytime performance. Our framework provides theoretical bounds on the query complexity of extracting tree-based models, offering new insights into the security vulnerabilities of their deployment.
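To make the threat model concrete, the following is a minimal, self-contained sketch (not the paper's algorithm) of why counterfactual explanations sharpen extraction attacks: for a hypothetical one-dimensional decision stump served behind an MLaaS oracle, label-only queries must binary-search for the split point, whereas a single counterfactual query reveals it exactly. All names here (`predict`, `counterfactual`, `SECRET_THRESHOLD`) are illustrative assumptions, not artifacts of the paper.

```python
import numpy as np

# Hypothetical proprietary model: a decision stump f(x) = [x >= t].
# In a real MLaaS deployment the threshold t is hidden; we fix it for the demo.
SECRET_THRESHOLD = 0.37

def predict(x: float) -> int:
    """Label-only oracle query, as exposed by a deployed MLaaS endpoint."""
    return int(x >= SECRET_THRESHOLD)

def counterfactual(x: float) -> float:
    """Counterfactual-explanation query: the closest input with the opposite
    label. For an axis-aligned split this point sits exactly on the threshold,
    so a single query leaks the split location."""
    return SECRET_THRESHOLD

def extract_by_labels(lo: float = 0.0, hi: float = 1.0, eps: float = 1e-6):
    """Extraction from labels alone: binary search over [lo, hi] localizes
    the threshold to precision eps using O(log(1/eps)) oracle queries."""
    queries = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        queries += 1
        if predict(mid):
            hi = mid  # threshold lies at or below mid
        else:
            lo = mid  # threshold lies above mid
    return (lo + hi) / 2, queries

def extract_by_counterfactual(x0: float = 0.0):
    """Extraction with explanations: one counterfactual query recovers the
    threshold exactly, illustrating the explainability/security trade-off."""
    return counterfactual(x0), 1

if __name__ == "__main__":
    t_lab, q_lab = extract_by_labels()
    t_cf, q_cf = extract_by_counterfactual()
    print(f"label-only:     t ~ {t_lab:.6f} after {q_lab} queries")
    print(f"counterfactual: t = {t_cf} after {q_cf} query")
```

The gap between the two query counts is the kind of complexity separation the abstract alludes to, though the paper's formal bounds concern full additive tree ensembles rather than this toy stump.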