Is the Hard-Label Cryptanalytic Model Extraction Really Polynomial?

Deep Neural Networks (DNNs) have attracted significant attention, and their internal models are now considered valuable intellectual assets. Extracting these internal models through access to a DNN is conceptually similar to extracting a secret key via oracle access to a block cipher. Consequently, cryptanalytic techniques, particularly differential-like attacks, have recently been actively explored. ReLU-based DNNs are the most widely deployed architectures. Early works (e.g., Crypto 2020, Eurocrypt 2024) assume access to exact output logits, which are usually hidden from the attacker. More recent works (e.g., Asiacrypt 2024, Eurocrypt 2025) focus on the hard-label setting, where only the final classification result (e.g., "dog" or "car") is available to the attacker. Notably, Carlini et al. (Eurocrypt 2025) demonstrated that model extraction is feasible in polynomial time even under this restricted setting. In this paper, we first show that the assumptions underlying their attack become increasingly unrealistic as the attack-target depth grows. In practice, satisfying these assumptions requires a number of queries that is exponential in the attack depth, implying that the attack does not always run in polynomial time. To address this critical limitation, we propose a novel attack method called CrossLayer Extraction. Instead of directly extracting the secret parameters (e.g., weights and biases) of a specific neuron, which incurs exponential cost, we exploit neuron interactions across layers to extract this information from deeper layers. This technique significantly reduces query complexity and mitigates the limitations of existing model extraction approaches.
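To make the hard-label setting concrete, the following is a minimal sketch of a hard-label oracle for a small ReLU network, together with a bisection search for a point near the decision boundary. It is an illustration only: it is not the attack of Carlini et al. and not CrossLayer Extraction. The function names (`logits`, `hard_label_oracle`, `find_boundary_point`) and the toy two-layer network are assumptions made for this example, and the bisection is a generic way to probe a label-only oracle rather than a step taken from the paper.

```python
import numpy as np

def logits(params, x):
    """Forward pass of a fully connected ReLU network.

    params is a list of (W, b) pairs; ReLU is applied after every
    layer except the last one, which outputs the logits.
    """
    h = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b, 0.0)
    W, b = params[-1]
    return W @ h + b

def hard_label_oracle(params):
    """Return a query function that reveals only the predicted class index."""
    def query(x):
        return int(np.argmax(logits(params, x)))
    return query

def find_boundary_point(query, x_a, x_b, tol=1e-9):
    """Bisect the segment from x_a to x_b, whose labels must differ.

    Returns a point within tol (in the segment parameter) of a label
    change, plus the number of oracle queries used.
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    label_a = query(x_a)
    if label_a == query(x_b):
        raise ValueError("endpoints must receive different labels")
    lo, hi = 0.0, 1.0
    n_queries = 2
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if query(x_a + mid * (x_b - x_a)) == label_a:
            lo = mid   # still on x_a's side of the label change
        else:
            hi = mid   # label already differs from x_a's label
        n_queries += 1
    return x_a + hi * (x_b - x_a), n_queries

# Hypothetical toy network: logits(x) = ReLU(x) for a 2-dimensional input.
toy_params = [(np.eye(2), np.zeros(2)), (np.eye(2), np.zeros(2))]
toy_query = hard_label_oracle(toy_params)
boundary_point, used_queries = find_boundary_point(toy_query, [1.0, 0.0], [0.0, 1.0])
print(boundary_point, used_queries)
```

The attacker in this sketch never sees the logits, only the class index returned by `query`, so every piece of information about the network has to be inferred from where the label changes. The number of queries per boundary point grows only logarithmically in the requested precision; the exponential cost discussed in the abstract concerns the assumptions of the attack at larger depths, not this search.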

