During the past decade, Deep Neural Networks (DNNs) have proved their value across a wide variety of applications. However, despite their high value and public accessibility, protecting the intellectual property of DNNs remains an open problem and an emerging research field. Recent works have successfully extracted fully connected DNNs using cryptanalytic methods in hard-label settings, showing that a DNN can be copied with high fidelity, i.e., high similarity in the output predictions. However, current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of neurons present in deep networks. In this work, we introduce a new end-to-end attack framework designed for high-fidelity model extraction of embedded DNNs. We describe a new black-box side-channel attack that splits the DNN into several linear parts, on which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs while maintaining high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen model to generate adversarial examples and achieve close to white-box performance on the victim's model (95.8% and 96.7% transfer rate).
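To make the fidelity metric concrete, the following minimal sketch computes it as the fraction of inputs on which a victim and an extracted model return the same hard label. This follows the abstract's description of fidelity as similarity in output predictions; the function name `fidelity` and the toy threshold classifiers `victim` and `extracted` are hypothetical stand-ins, not the models or the evaluation code used in this work.

```python
from typing import Callable, Sequence


def fidelity(
    victim_predict: Callable[[float], int],
    extracted_predict: Callable[[float], int],
    inputs: Sequence[float],
) -> float:
    """Fraction of inputs on which both models return the same hard label."""
    if not inputs:
        raise ValueError("inputs must be non-empty")
    matches = sum(1 for x in inputs if victim_predict(x) == extracted_predict(x))
    return matches / len(inputs)


# Hypothetical toy models: two threshold classifiers that disagree
# only on inputs falling between their decision boundaries.
def victim(x: float) -> int:
    return int(x > 0.5)


def extracted(x: float) -> int:
    return int(x > 0.6)


samples = [0.1, 0.3, 0.55, 0.7, 0.9]
score = fidelity(victim, extracted, samples)
print(f"fidelity = {score:.1%}")
```

In this toy case the two classifiers disagree only on the sample 0.55, which lies between their thresholds, so four of the five labels match.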