Machine learning systems increasingly rely on open-source artifacts such as datasets and models that are created or hosted by other parties. This reliance on external datasets and pre-trained models exposes the system to supply chain attacks, in which an artifact can be poisoned before it is delivered to the end user. Such attacks are possible because existing machine learning systems lack any authenticity verification. Incorporating cryptographic solutions such as hashing and signing can mitigate the risk of supply chain attacks. However, existing frameworks for integrity verification based on cryptographic techniques can incur significant overhead when applied to state-of-the-art machine learning artifacts due to their scale, and are not compatible with GPU platforms. In this paper, we develop Sentry, a novel GPU-based framework that verifies the authenticity of machine learning artifacts by implementing cryptographic signing and verification for datasets and models. Sentry ties developer identities to signatures and performs authentication on the fly as artifacts are loaded into GPU memory, making it compatible with GPU data movement solutions such as NVIDIA GPUDirect that bypass the CPU. Sentry incorporates GPU acceleration of cryptographic hash constructions such as Merkle trees and lattice hashing, implementing memory optimizations and resource partitioning schemes to achieve high throughput. Our evaluations show that Sentry is a practical solution for bringing authenticity to machine learning systems, achieving orders-of-magnitude speedup over a CPU-based baseline.
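To make the Merkle tree construction mentioned above concrete, the following is a minimal CPU-only sketch in Python; it is not Sentry's GPU implementation. The choice of SHA-256, the chunk size, the leaf/interior prefix bytes, and the names `merkle_root` and `verify` are all assumptions made for illustration. The sketch splits an artifact into fixed-size chunks, hashes each chunk as a leaf, and combines hashes pairwise until a single root remains. In a signing workflow, the root is the value a developer would sign; the signature step is omitted here because it depends on the chosen signature scheme.

```python
import hashlib
import hmac

CHUNK_SIZE = 16  # hypothetical chunk size in bytes, kept tiny for illustration


def _h(data: bytes) -> bytes:
    """SHA-256 digest (assumed hash function for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(artifact: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compute a Merkle root over fixed-size chunks of an artifact."""
    # Split into chunks; an empty artifact is treated as one empty chunk.
    chunks = [artifact[i:i + chunk_size]
              for i in range(0, len(artifact), chunk_size)] or [b""]
    # Leaf hashes use a 0x00 prefix so leaves cannot be confused
    # with interior nodes, which use a 0x01 prefix.
    level = [_h(b"\x00" + c) for c in chunks]
    while len(level) > 1:
        # Hash adjacent pairs into the next level.
        nxt = [_h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        # An unpaired last node is carried up unchanged.
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def verify(artifact: bytes, trusted_root: bytes) -> bool:
    """Recompute the root and compare it to a trusted value."""
    return hmac.compare_digest(merkle_root(artifact), trusted_root)


if __name__ == "__main__":
    model = b"weights-v1-" * 10          # stand-in for a model file
    root = merkle_root(model)            # value a developer would sign
    print(verify(model, root))                  # unmodified artifact
    print(verify(model[:-1] + b"X", root))      # tampered artifact
```

Because each leaf hash depends only on its own chunk, the leaf level can be computed in parallel, which is what makes this kind of construction a natural candidate for GPU acceleration.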