Adversarial attacks perturb images so that a predictor outputs incorrect results. Because research on structured attacks is limited, imposing consistency checks on natural multi-object scenes is a promising and practical defense against conventional adversarial attacks. A more desirable attack should therefore be able to fool defenses that apply such consistency checks. To this end, we present GLOW, the first approach that copes with various attack requests by generating global layout-aware adversarial attacks, in which both categorical and geometric layout constraints are explicitly established. Specifically, we focus on the object detection task: given a victim image, GLOW first localizes victim objects according to target labels and then generates multiple attack plans together with their context-consistency scores. On the one hand, GLOW can handle various types of requests, including single or multiple victim objects, with or without the victim objects being specified. On the other hand, it produces a consistency score for each attack plan that reflects overall contextual consistency, taking both semantic category and global scene layout into account. In our experiments, we design multiple types of attack requests and validate our ideas on MS COCO and Pascal. Extensive experimental results demonstrate that our method achieves about 30$\%$ average relative improvement over state-of-the-art methods on the conventional single-object attack request. Moreover, it significantly outperforms state-of-the-art methods on more generic attack requests, by about 20$\%$ on average. Finally, it achieves superior performance under the challenging zero-query black-box setting, about 20$\%$ better than state-of-the-art methods. Our code, model, and attack requests will be made available.
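To make the idea of attack plans ranked by a context-consistency score more concrete, the following is a minimal Python sketch. It is not GLOW's actual algorithm, which the abstract does not specify. The co-occurrence table, the typical-area prior, the scoring formulas, and every name (`DetectedObject`, `AttackPlan`, `generate_plans`, and so on) are hypothetical choices made purely for illustration. The sketch only shows the overall shape: pick a victim object (given or inferred), relabel it with the target label, and score the resulting scene with a categorical term and a geometric term.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DetectedObject:
    label: str
    box: Tuple[float, float, float, float]  # (x, y, w, h), normalized to [0, 1]


@dataclass(frozen=True)
class AttackPlan:
    victim: int               # index of the object whose label is changed
    labels: Tuple[str, ...]   # labels of all objects after the attack
    score: float              # toy context-consistency score in [0, 1]


# Hypothetical priors, invented for this example (not taken from the paper).
COOCCUR = {
    frozenset({"person", "horse"}): 0.9,
    frozenset({"person", "dog"}): 0.8,
    frozenset({"horse", "dog"}): 0.4,
}
TYPICAL_AREA = {"person": 0.10, "horse": 0.15, "dog": 0.05}


def categorical_score(labels: Tuple[str, ...]) -> float:
    """Mean pairwise co-occurrence of the labels (assumed scoring rule)."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 1.0
    total = 0.0
    for a, b in pairs:
        # Identical labels are treated as fully compatible; unknown pairs get 0.1.
        total += 1.0 if a == b else COOCCUR.get(frozenset({a, b}), 0.1)
    return total / len(pairs)


def geometric_score(objects: List[DetectedObject], labels: Tuple[str, ...]) -> float:
    """Mean agreement between each box area and its label's typical area."""
    scores = []
    for obj, label in zip(objects, labels):
        area = obj.box[2] * obj.box[3]
        typical = TYPICAL_AREA.get(label)
        if typical is None:
            scores.append(0.5)  # neutral score for labels without a prior
        else:
            scores.append(max(0.0, 1.0 - abs(area - typical) / typical))
    return sum(scores) / len(scores) if scores else 1.0


def generate_plans(objects: List[DetectedObject], target_label: str,
                   victim_index: Optional[int] = None) -> List[AttackPlan]:
    """Return candidate plans sorted from most to least consistent."""
    if victim_index is not None:
        candidates = [victim_index]
    else:
        # No victim specified: every object not already carrying the target label.
        candidates = [i for i, o in enumerate(objects) if o.label != target_label]
    plans = []
    for i in candidates:
        labels = tuple(target_label if j == i else o.label
                       for j, o in enumerate(objects))
        score = categorical_score(labels) * geometric_score(objects, labels)
        plans.append(AttackPlan(i, labels, score))
    return sorted(plans, key=lambda p: p.score, reverse=True)


scene = [
    DetectedObject("person", (0.1, 0.2, 0.2, 0.5)),
    DetectedObject("dog", (0.5, 0.4, 0.3, 0.5)),
]
for plan in generate_plans(scene, target_label="horse"):
    print(plan)
```

Under these toy priors, the request "make something a horse" without a specified victim ranks relabeling the second object above relabeling the first, because a person next to a horse is a more plausible scene than a horse next to a dog. Passing `victim_index` restricts the search to a single plan, which mirrors the requests with a specified victim object.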