Training high-quality deep learning models is challenging because of its computational and technical requirements. A growing number of individuals, institutions, and companies rely on pre-trained, third-party models made available in public repositories. These models are often used directly or integrated into product pipelines with no particular precautions, since they are effectively just data in tensor form and are considered safe. In this paper, we raise awareness of a new machine learning supply chain threat targeting neural networks. We introduce MaleficNet 2.0, a novel technique to embed self-extracting, self-executing malware in neural networks. MaleficNet 2.0 uses spread-spectrum channel coding combined with error correction techniques to inject malicious payloads into the parameters of deep neural networks. The MaleficNet 2.0 injection technique is stealthy, does not degrade the performance of the model, and is robust against removal techniques. We design our approach to work in both traditional and distributed learning settings such as Federated Learning, and demonstrate that it is effective even when a reduced number of bits is used for the model parameters. Finally, we implement a proof-of-concept self-extracting neural network malware using MaleficNet 2.0, demonstrating the practicality of the attack against a widely adopted machine learning framework. Our aim with this work is to raise awareness of these new, dangerous attacks in both the research community and industry, and we hope to encourage further research into mitigation techniques against such threats.