Deploying deep learning-based applications has become an essential task owing to the increasing demand for intelligent services. In this paper, we investigate latency attacks on deep learning applications. Unlike common adversarial attacks, which aim to cause misclassification, a latency attack aims to increase the inference time, which may prevent an application from responding to requests within a reasonable time. This kind of attack applies to a wide range of applications, and we use object detection to demonstrate how it works. We also design a framework named Overload to generate latency attacks at scale. Our method is based on a newly formulated optimization problem and a novel technique called spatial attention. The attack escalates the computing cost required at inference time, which in turn extends the inference time for object detection. It poses a significant threat, especially to systems with limited computing resources. We conducted experiments using YOLOv5 models on Nvidia NX. Compared to existing methods, our method is simpler and more effective. The experimental results show that, under a latency attack, the inference time of a single image can increase to ten times that of the normal setting. Moreover, because our attack is NMS-agnostic, our findings pose a potential new threat to all object detection tasks that require non-maximum suppression (NMS).
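To make the link between NMS and inference cost concrete, the following is a minimal sketch, not the Overload method and not the NMS implementation used by YOLOv5. It assumes that the extra cost comes from the number of candidate boxes that reach NMS, which the abstract implies but does not spell out. The helper names (`iou`, `greedy_nms`, `disjoint_boxes`, `nms_cost`) are hypothetical and introduced only for illustration. The sketch counts IoU evaluations in a textbook greedy NMS when no box suppresses any other, which is the worst case for this procedure.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Textbook greedy NMS. Returns (kept indices, number of IoU evaluations)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept, comparisons = [], 0
    while order:
        best = order.pop(0)          # highest-scoring remaining box
        kept.append(best)
        survivors = []
        for i in order:
            comparisons += 1         # one IoU evaluation per remaining box
            if iou(boxes[best], boxes[i]) <= iou_thr:
                survivors.append(i)  # not suppressed, stays in the queue
        order = survivors
    return kept, comparisons


def disjoint_boxes(n, size=10.0, gap=5.0):
    """n non-overlapping boxes in a row, so NMS suppresses nothing."""
    return [(i * (size + gap), 0.0, i * (size + gap) + size, size) for i in range(n)]


def nms_cost(n):
    """Number of boxes kept and IoU evaluations performed for n disjoint boxes."""
    boxes = disjoint_boxes(n)
    scores = [1.0 - i / (n + 1) for i in range(n)]  # distinct, descending scores
    kept, comparisons = greedy_nms(boxes, scores)
    return len(kept), comparisons


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, nms_cost(n))
```

In this worst case the number of IoU evaluations is n(n-1)/2, so a tenfold increase in surviving candidate boxes costs roughly a hundred times more comparisons. Production NMS kernels are vectorized and often cap the number of candidates, so the measured slowdown on real hardware will differ from this count.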