Over the past decade, Deep Neural Networks (DNNs) have proved their value across a wide variety of domains. However, despite their high value and public accessibility, protecting the intellectual property of DNNs remains an open problem and an emerging research field. Recent works have successfully extracted fully connected DNNs using cryptanalytic methods in hard-label settings, showing that a DNN can be copied with high fidelity, i.e., high similarity in output predictions. However, current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of the neurons found in deep networks. In this work, we introduce a new end-to-end attack framework designed for high-fidelity model extraction of embedded DNNs. We describe a new black-box side-channel attack that splits the DNN into several linear parts, on which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs while maintaining high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen model to generate adversarial examples and achieve close to white-box performance on the victim's model (95.8% and 96.7% transfer rate).
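The abstract describes fidelity as similarity in output predictions. A minimal sketch of one common reading of that metric follows, assuming fidelity is the fraction of inputs on which the victim and the extracted model return the same hard label. The function name `fidelity` and the sample labels are hypothetical, and the exact metric and evaluation set used in this work may differ.

```python
from typing import Sequence


def fidelity(victim_labels: Sequence[int], stolen_labels: Sequence[int]) -> float:
    """Fraction of inputs on which both models predict the same hard label.

    Assumption: this is label agreement, one common definition of fidelity;
    it is not confirmed to be the exact metric used in the paper.
    """
    if len(victim_labels) != len(stolen_labels):
        raise ValueError("label sequences must have the same length")
    if len(victim_labels) == 0:
        raise ValueError("need at least one prediction")
    # Count positions where the two models agree on the predicted class.
    agree = sum(v == s for v, s in zip(victim_labels, stolen_labels))
    return agree / len(victim_labels)


# Hypothetical hard-label predictions on five inputs (illustrative data only).
victim = [3, 1, 4, 1, 5]
stolen = [3, 1, 4, 2, 5]
print(fidelity(victim, stolen))
```

Note that this measures agreement with the victim's predictions, not with ground-truth labels, so a stolen model can have high fidelity even on inputs the victim misclassifies.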