Numerous mobile apps leverage deep learning capabilities. However, on-device models are vulnerable to attacks because they can be easily extracted from their corresponding mobile apps. Existing on-device attack approaches only generate black-box attacks, which are far less effective and efficient than white-box strategies. This is because mobile deep learning frameworks like TFLite do not support gradient computation, which white-box attack algorithms require. Thus, we argue that existing findings may underestimate the harmfulness of on-device attacks. To this end, we conduct a study to answer the following research question: Can on-device models be directly attacked via white-box strategies? We first systematically analyze the difficulties of transforming an on-device model into a debuggable version, and propose a Reverse Engineering framework for On-device Models (REOM), which automatically reverses a compiled on-device TFLite model into a debuggable model. Specifically, REOM first transforms compiled on-device models into the Open Neural Network Exchange (ONNX) format, then removes the non-debuggable parts, and finally converts the result into a debuggable DL model format that attackers can exploit in a white-box setting. Our experimental results show that our approach is effective at automated transformation across 244 TFLite models. Compared with previous attacks using surrogate models, REOM enables attackers to achieve higher attack success rates with attack perturbations a hundred times smaller. In addition, because the ONNX platform offers many tools for model format conversion, the proposed ONNX-based method can be adapted to other model formats. Our findings emphasize the need for developers to carefully consider their model deployment strategies and to use white-box methods to evaluate the vulnerability of on-device models.
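To illustrate why gradient access matters for white-box attacks, the following is a minimal sketch of a single Fast Gradient Sign Method (FGSM) step on a toy logistic model. FGSM is used here only as a familiar example of a gradient-based attack; the abstract does not name the attack algorithms evaluated. The weights, bias, input, and `eps` are made-up values, and the gradient is derived analytically, so no deep learning framework is assumed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy with respect to the input x.

    For a logistic model this has the closed form dL/dx = (p - y) * w.
    A framework without gradient support cannot provide this for a real
    network, which is the obstacle the abstract describes.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def fgsm(w, b, x, y, eps):
    """One FGSM step: shift each feature by eps along the sign of the gradient."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (illustrative values only).
w = [2.0, -1.0, 0.5]
b = 0.1
x = [0.4, 0.2, 0.3]
y = 1  # true label

x_adv = fgsm(w, b, x, y, eps=0.3)
clean_p = predict(w, b, x)
adv_p = predict(w, b, x_adv)
print("clean P(class 1):", clean_p)
print("adversarial P(class 1):", adv_p)
```

On this toy input, one gradient-guided step bounded by `eps` per feature is enough to move the prediction across the 0.5 decision threshold. A black-box attacker would have to estimate the same direction through repeated queries or a surrogate model.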