One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training

Deep neural networks (DNNs) are widely deployed on real-world devices. Concerns regarding their security have gained great attention from researchers. Recently, a new weight modification attack called bit flip attack (BFA) was proposed, which exploits memory fault inject techniques such as row hammer to attack quantized models in the deployment stage. With only a few bit flips, the target model can be rendered useless as a random guesser or even be implanted with malicious functionalities. In this work, we seek to further reduce the number of bit flips. We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release. This high-risk model, obtained coupled with a corresponding malicious model, behaves normally and can escape various detection methods. The results on benchmark datasets show that an adversary can easily convert this high-risk but normal model to a malicious one on victim's side by \textbf{flipping only one critical bit} on average in the deployment stage. Moreover, our attack still poses a significant threat even when defenses are employed. The codes for reproducing main experiments are available at \url{https://github.com/jianshuod/TBA}.

翻译：深度神经网络（DNNs）广泛应用于现实世界的设备中，其安全性问题已引起研究者的极大关注。近期，一种名为比特翻转攻击（BFA）的新型权重修改攻击被提出，该攻击利用行锤等内存故障注入技术，在部署阶段攻击量化模型。仅需翻转少量比特，目标模型就会沦为随机猜测器，甚至被植入恶意功能。本研究致力于进一步减少所需翻转的比特数量。我们提出了一种训练辅助的比特翻转攻击，攻击者通过参与训练阶段构建并发布一个高风险模型。该高风险模型与对应的恶意模型协同获取，其行为表现正常，能够绕过多种检测方法。基准数据集上的实验结果表明，攻击者在部署阶段平均仅需翻转一个关键比特，即可轻松将受害者侧的高风险正常模型转化为恶意模型。此外，即便采用防御措施，我们的攻击仍构成重大威胁。复现主要实验的代码已公开于\url{https://github.com/jianshuod/TBA}。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

【CHI2020-微软】解释可解释性:理解数据科学家使用机器学习的可解释性工具，Interpreting Interpretability: Understanding Data Scientists’Use of Interpretability Tools for Machine Learning

专知会员服务

55+阅读 · 2020年3月8日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日