Deep neural networks (DNNs) are at the forefront of modern technology and achieve remarkable performance in a variety of complex tasks. Nevertheless, their integration into safety-critical systems, such as those in the aerospace or automotive domains, poses a significant challenge due to the threat of adversarial inputs: perturbed inputs that can cause the DNN to make grievous mistakes. Multiple studies have demonstrated that even modern DNNs are susceptible to adversarial inputs, so this risk must be measured and mitigated before DNNs can be deployed in critical settings. Here, we present gRoMA (global Robustness Measurement and Assessment), an innovative and scalable tool that implements a probabilistic approach to measuring the global categorial robustness of a DNN. Specifically, gRoMA measures the probability of encountering adversarial inputs for a specific output category. Our tool operates on pre-trained, black-box classification DNNs, and generates input samples belonging to an output category of interest. It measures the DNN's susceptibility to adversarial inputs around these samples, and aggregates the results to infer the overall global categorial robustness of the DNN, up to a small, bounded statistical error. We evaluate our tool on the popular DenseNet DNN model over the CIFAR-10 dataset. Our results reveal significant gaps in robustness between the different output categories. This experiment demonstrates the usefulness and scalability of our approach, and its potential for allowing DNNs to be deployed within critical systems of interest.
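To make the sample-measure-aggregate pipeline described above concrete, the following is a minimal sketch and not the gRoMA implementation. It assumes a black-box classifier exposed as a plain Python callable, uniform perturbations within an L-infinity ball of radius `epsilon`, and a Hoeffding bound for the statistical error; all three are illustrative choices that the abstract does not specify. The names `local_adversarial_rate`, `categorial_robustness`, and `toy_classifier` are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

# Black-box classifier: maps an input vector to a predicted category index.
Classifier = Callable[[Sequence[float]], int]


def local_adversarial_rate(classify: Classifier, x: Sequence[float], label: int,
                           epsilon: float, n_perturbations: int,
                           rng: random.Random) -> float:
    """Fraction of random perturbations of x that the classifier mislabels.

    Assumption: perturbations are uniform in an L-infinity ball of radius epsilon.
    """
    misclassified = 0
    for _ in range(n_perturbations):
        perturbed = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        if classify(perturbed) != label:
            misclassified += 1
    return misclassified / n_perturbations


def categorial_robustness(classify: Classifier, samples: List[Sequence[float]],
                          label: int, epsilon: float, n_perturbations: int = 200,
                          delta: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Estimate robustness for one output category, with an error margin.

    Returns (robustness, margin), where robustness is 1 minus the mean local
    adversarial rate over the samples. Assumption: the samples are i.i.d., so
    Hoeffding's inequality bounds the estimation error by `margin` with
    probability at least 1 - delta.
    """
    rng = random.Random(seed)
    rates = [local_adversarial_rate(classify, x, label, epsilon, n_perturbations, rng)
             for x in samples]
    mean_rate = sum(rates) / len(rates)
    margin = math.sqrt(math.log(2.0 / delta) / (2.0 * len(rates)))
    return 1.0 - mean_rate, margin


def toy_classifier(x: Sequence[float]) -> int:
    """Hypothetical stand-in for a DNN: category 1 if the first feature is positive."""
    return 1 if x[0] > 0 else 0


if __name__ == "__main__":
    # Inputs of category 1; the second one lies close to the decision boundary.
    category_samples = [[1.0], [0.05]]
    robustness, margin = categorial_robustness(toy_classifier, category_samples,
                                               label=1, epsilon=0.1)
    print(f"estimated robustness: {robustness:.3f} +/- {margin:.3f}")
```

With only a handful of samples the Hoeffding margin is very loose; it shrinks proportionally to the inverse square root of the number of samples, so a meaningful bound requires many inputs per category.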