Investigating White-Box Attacks for On-Device Models

Numerous mobile apps now ship with deep learning models. However, on-device models are vulnerable to attack because they can be easily extracted from the apps that bundle them. Existing attacks on on-device models are black-box only, and black-box attacks are far less effective and efficient than white-box ones. This limitation arises because mobile deep learning frameworks such as TFLite do not support gradient computation, which white-box attack algorithms require. We therefore argue that existing findings may underestimate the harm of on-device attacks. To this end, we conduct a study to answer one research question: can on-device models be attacked directly with white-box strategies? We first systematically analyze the difficulties of transforming an on-device model into a debuggable version, and then propose a Reverse Engineering framework for On-device Models (REOM), which automatically converts a compiled on-device TFLite model back into a debuggable model. Specifically, REOM first transforms the compiled on-device model into the Open Neural Network Exchange (ONNX) format, then removes the non-debuggable parts, and finally converts the result into a debuggable DL model format that attackers can exploit in a white-box setting. Our experimental results show that the approach achieves effective automated transformation across 244 TFLite models. Compared with previous attacks that rely on surrogate models, REOM enables attackers to achieve higher attack success rates with attack perturbations a hundred times smaller. In addition, because the ONNX platform offers many tools for model format conversion, the proposed ONNX-based method can be adapted to other model formats. Our findings emphasize the need for developers to carefully consider their model deployment strategies and to use white-box methods when evaluating the vulnerability of on-device models.
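
To illustrate why gradient access matters for white-box attacks, the following is a minimal sketch of a gradient-based perturbation on a toy model. It is not the REOM pipeline and not an attack taken from the study: the logistic-regression model, the weights, the input, and the use of the fast gradient sign method (FGSM) are all assumptions chosen for illustration. The point is only that the attacker needs the gradient of the loss with respect to the input, which a framework without gradient support cannot provide directly.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def loss_gradient_wrt_input(weights, bias, x, label):
    """Gradient of the cross-entropy loss with respect to the input.

    For logistic regression this has the closed form (p - y) * w.
    A real white-box attack would obtain it via automatic differentiation
    on the debuggable model instead.
    """
    p = predict(weights, bias, x)
    return [(p - label) * w for w in weights]


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """One FGSM step: move each feature by epsilon in the direction
    that increases the loss for the true label."""
    grad = loss_gradient_wrt_input(weights, bias, x, label)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input, chosen only for the demo.
WEIGHTS = [2.0, -3.0, 1.0]
BIAS = 0.5
EPSILON = 0.4

x_clean = [0.6, 0.1, 0.4]
true_label = 1

x_adv = fgsm(WEIGHTS, BIAS, x_clean, true_label, EPSILON)

print("clean score:      ", predict(WEIGHTS, BIAS, x_clean))
print("adversarial score:", predict(WEIGHTS, BIAS, x_adv))
```

In this toy setting the perturbation is bounded by `EPSILON` per feature, and the gradient tells the attacker which direction to move each feature. Without gradients, an attacker would have to estimate that direction through repeated queries or a surrogate model, which is the black-box situation the abstract contrasts against.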
