Unlike a traditional neural network with a single exit, a multi-exit network has multiple exits that let intermediate layers emit a prediction early, which significantly improves computational efficiency while maintaining similar recognition accuracy. When we attempt to steal such valuable models with traditional model stealing attacks, we find that conventional methods steal only the model's classification function and fail to capture its output strategy. As a result, the stolen substitute model is significantly less computationally efficient and loses the advantages of a multi-exit network. In this paper, we propose the first model stealing attack that extracts both the model function and the output strategy. We employ Bayesian changepoint detection to analyze the target model's output strategy, and we use a performance loss and a strategy loss to guide the training of the substitute model. Furthermore, we design a novel output strategy search algorithm that finds the optimal output strategy, maximizing the consistency between the outputs of the victim model and the substitute model. Experiments on multiple mainstream multi-exit networks and benchmark datasets demonstrate the effectiveness of our method.
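To make the notion of an output strategy concrete, the following minimal sketch shows one common form of early-exit inference: the input is passed through the exits in order, and the first exit whose softmax confidence reaches its threshold returns the prediction. This is an illustration under stated assumptions, not the method of this paper. The confidence-threshold rule, the names `softmax` and `early_exit_predict`, and the toy exit heads are all hypothetical; a real multi-exit network shares a backbone between exits, which the sketch omits for brevity.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    exit_heads: Sequence[Callable[[float], Sequence[float]]],
    x: float,
    thresholds: Sequence[float],
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index) for input x.

    Assumption: the output strategy is one confidence threshold per
    non-final exit; the final exit always answers.
    """
    last = len(exit_heads) - 1
    for i, head in enumerate(exit_heads):
        probs = softmax(head(x))
        confidence = max(probs)
        # Short-circuit on the last exit so thresholds needs only `last` entries.
        if i == last or confidence >= thresholds[i]:
            return probs.index(confidence), i
    raise ValueError("exit_heads must not be empty")


# Hypothetical toy exits: each maps a scalar input to two-class logits.
toy_heads = [lambda x: [x, 0.0], lambda x: [2.0 * x, 0.0]]
toy_thresholds = [0.9]  # threshold for the first (early) exit only

easy = early_exit_predict(toy_heads, 3.0, toy_thresholds)  # confident early
hard = early_exit_predict(toy_heads, 0.5, toy_thresholds)  # falls through
```

In this sketch, the thresholds are the output strategy: a substitute model that matches the victim's class predictions but uses different thresholds would exit at different depths and therefore have a different computational cost.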