Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem

Training high-quality deep learning models is a challenging task due to computational and technical requirements. A growing number of individuals, institutions, and companies rely on pre-trained, third-party models made available in public repositories. These models are often used directly or integrated into product pipelines with no particular precautions, since they are effectively just data in tensor form and are therefore considered safe. In this paper, we raise awareness of a new machine learning supply chain threat targeting neural networks. We introduce MaleficNet 2.0, a novel technique to embed self-extracting, self-executing malware in neural networks. MaleficNet 2.0 uses spread-spectrum channel coding combined with error correction techniques to inject malicious payloads into the parameters of deep neural networks. The MaleficNet 2.0 injection technique is stealthy, does not degrade the performance of the model, and is robust against removal techniques. We design our approach to work in both traditional and distributed learning settings such as Federated Learning, and demonstrate that it is effective even when a reduced number of bits is used for the model parameters. Finally, we implement a proof-of-concept self-extracting neural network malware using MaleficNet 2.0, demonstrating the practicality of the attack against a widely adopted machine learning framework. Our aim with this work is to raise awareness of these new, dangerous attacks in both the research community and industry, and we hope to encourage further research into mitigation techniques against such threats.
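To make the spread-spectrum idea concrete, the following is a minimal toy sketch of how a short bit string can be hidden in, and recovered from, a vector of floating-point weights. It is not the authors' implementation: the function names (`spreading_code`, `embed`, `extract`), the seed handling, the gain `gamma`, and the Gaussian stand-in for model weights are all assumptions made for illustration. The error-correction stage mentioned in the abstract is omitted, and the bits here are inert data.

```python
import random

def spreading_code(seed, index, length):
    """Pseudo-random +/-1 sequence for bit `index`, reproducible from a shared seed."""
    rng = random.Random(seed * 1_000_003 + index)  # hypothetical seed derivation
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(weights, bits, seed, gamma):
    """Add every bit, spread over all weights, as a low-amplitude +/-gamma signal."""
    out = list(weights)
    for j, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        code = spreading_code(seed, j, len(out))
        for i, c in enumerate(code):
            out[i] += gamma * sign * c
    return out

def extract(weights, n_bits, seed):
    """Recover each bit from the sign of the correlation with its spreading code."""
    bits = []
    for j in range(n_bits):
        code = spreading_code(seed, j, len(weights))
        corr = sum(w * c for w, c in zip(weights, code))
        bits.append(1 if corr > 0 else 0)
    return bits

if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a flattened layer of model parameters (assumed Gaussian).
    weights = [rng.gauss(0.0, 0.05) for _ in range(20000)]
    message = [1, 0, 1, 1, 0, 0, 1, 0]

    stego = embed(weights, message, seed=7, gamma=0.003)
    recovered = extract(stego, len(message), seed=7)

    print("recovered == message:", recovered == message)
    print("largest per-weight change:",
          max(abs(a - b) for a, b in zip(weights, stego)))
```

Each bit is spread across every weight, so no single parameter changes by more than `gamma` times the number of embedded bits. The correlation at extraction grows with the number of weights, which is why a small per-weight perturbation can still be decoded reliably under these assumptions.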

