Randomness supports many critical functions in the field of machine learning (ML), including optimisation, data selection, privacy, and security. ML systems outsource the task of generating or harvesting randomness to the compiler, the cloud service provider, or elsewhere in the toolchain. Yet there is a long history of attackers exploiting poor randomness, or even creating it -- as when the NSA put backdoors in random number generators to break cryptography. In this paper we consider whether attackers can compromise an ML system using only the randomness on which such systems commonly rely. We focus our effort on Randomised Smoothing, a popular approach for training certifiably robust models and for certifying specific input datapoints of an arbitrary model. We choose Randomised Smoothing because it is used for both security and safety -- to counteract adversarial examples and to quantify uncertainty, respectively. Under the hood, it relies on sampling Gaussian noise to explore the volume around a data point in order to certify that a model is not vulnerable to adversarial examples. We demonstrate an entirely novel attack, in which an attacker backdoors the supplied randomness to falsely certify either an overestimate or an underestimate of robustness by a factor of up to 81. We show that such attacks are possible, that they require only very small changes to the randomness to succeed, and that they are hard to detect. As an example, we hide an attack in the random number generator and show that the randomness tests suggested by NIST fail to detect it. We advocate updating the NIST guidelines on random number testing to make them more appropriate for safety-critical and security-critical machine-learning applications.
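The Gaussian-sampling mechanism described above can be illustrated with a minimal sketch of a smoothing-style certification loop. This is an assumption-laden simplification, not the paper's implementation: the classifier `f`, the parameters, and the Hoeffding-style confidence bound are all illustrative stand-ins (the standard Randomised Smoothing procedure of Cohen et al. uses an exact Clopper-Pearson bound and separate selection/estimation samples).

```python
import numpy as np
from statistics import NormalDist

def certify(f, x, sigma=0.25, n=1000, alpha=0.001, rng=None):
    """Sketch of Monte-Carlo certification via Gaussian smoothing.

    Samples n Gaussian perturbations of x, takes the majority class of
    the base classifier f, and converts a lower confidence bound on that
    class's probability into a certified L2 radius. Returns (label, radius);
    a radius of 0.0 means the procedure abstains.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n,) + x.shape)
    votes = np.bincount([f(x + eps) for eps in noise], minlength=2)
    top = int(votes.argmax())
    # Crude Hoeffding lower confidence bound on p_A -- a placeholder for
    # the exact Clopper-Pearson bound used in the published algorithm.
    p_hat = votes[top] / n
    p_lower = p_hat - np.sqrt(np.log(1.0 / alpha) / (2 * n))
    if p_lower <= 0.5:
        return top, 0.0  # abstain: majority class not confident enough
    # Certified radius: sigma * Phi^{-1}(p_lower).
    return top, sigma * NormalDist().inv_cdf(p_lower)
```

The key point for the attack surface discussed above is the `rng` argument: the certificate is only as trustworthy as the noise source, so a backdoored generator that subtly biases `noise` can inflate or deflate the reported radius without changing the model at all.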