Neural networks are successful in various applications but are also susceptible to adversarial attacks. To show the safety of network classifiers, many verifiers have been introduced to reason about the local robustness of a given input to a given perturbation. While successful, local robustness cannot generalize to unseen inputs. Several works analyze global robustness properties, however, neither can provide a precise guarantee about the cases where a network classifier does not change its classification. In this work, we propose a new global robustness property for classifiers aiming at finding the minimal globally robust bound, which naturally extends the popular local robustness property for classifiers. We introduce VHAGaR, an anytime verifier for computing this bound. VHAGaR relies on three main ideas: encoding the problem as a mixed-integer programming and pruning the search space by identifying dependencies stemming from the perturbation or the network's computation and generalizing adversarial attacks to unknown inputs. We evaluate VHAGaR on several datasets and classifiers and show that, given a three hour timeout, the average gap between the lower and upper bound on the minimal globally robust bound computed by VHAGaR is 1.9, while the gap of an existing global robustness verifier is 154.7. Moreover, VHAGaR is 130.6x faster than this verifier. Our results further indicate that leveraging dependencies and adversarial attacks makes VHAGaR 78.6x faster.
翻译:神经网络在各种应用中取得了成功,但也容易受到对抗性攻击的影响。为了展示网络分类器的安全性,研究者们引入了许多验证器来推理给定输入在特定扰动下的局部鲁棒性。尽管这些方法有效,但局部鲁棒性无法泛化到未见输入。部分工作分析了全局鲁棒性属性,然而这些方法无法对网络分类器不改变分类结果的情况提供精确保证。本文提出了一种针对分类器的新型全局鲁棒性属性,旨在找到最小全局鲁棒边界,该属性自然推广了流行的分类器局部鲁棒性属性。我们引入了VHAGaR——一种用于计算该边界的任意时刻验证器。VHAGaR基于三个核心思想:将问题编码为混合整数规划、通过识别源于扰动或网络计算的依赖关系来剪枝搜索空间,以及将对抗性攻击推广到未知输入。我们在多个数据集和分类器上评估了VHAGaR,结果表明:在三小时超时限制下,VHAGaR计算的最小全局鲁棒边界下界与上界之间的平均差距为1.9,而现有全局鲁棒验证器的差距为154.7。此外,VHAGaR比该验证器快130.6倍。我们的结果进一步表明,利用依赖关系和对抗性攻击使VHAGaR的速度提升了78.6倍。