Neural networks are vulnerable to adversarial attacks: small input perturbations can significantly alter their outputs. Therefore, to ensure safety in safety-critical environments, the robustness of a neural network must be formally verified against input perturbations, e.g., from noisy sensors. To improve the robustness of neural networks and thereby simplify their formal verification, we present a novel set-based training procedure: given the set of possible inputs, we compute the set of possible outputs and, for the first time, a gradient set, i.e., each possible output has its own gradient. This allows us to directly reduce the size of the output enclosure by choosing gradients that point toward its center. Small output enclosures increase the robustness of a neural network and, at the same time, simplify its formal verification, because a larger size of the propagated sets increases the conservatism of most verification methods. Our extensive evaluation demonstrates that set-based training produces robust neural networks with competitive performance, which, due to the reduced output set, can be verified using fast (polynomial-time) verification algorithms.
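The core idea of propagating an input set through a network and measuring the size of the resulting output enclosure can be sketched with simple interval arithmetic. This is only a minimal illustration of set propagation, not the paper's actual set representation or training procedure; the network weights, layer sizes, and perturbation radius below are illustrative assumptions.

```python
import numpy as np

def interval_affine(c, r, W, b):
    """Propagate a box (center c, radius r) through an affine layer."""
    return W @ c + b, np.abs(W) @ r

def interval_relu(c, r):
    """Propagate a box through an elementwise ReLU."""
    lo = np.maximum(c - r, 0.0)  # lower bound after ReLU
    hi = np.maximum(c + r, 0.0)  # upper bound after ReLU
    return (lo + hi) / 2, (hi - lo) / 2

# Toy two-layer ReLU network (weights are illustrative assumptions)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x_c = np.array([0.5, -0.2])  # nominal input
x_r = np.full(2, 0.01)       # perturbation radius, e.g., sensor noise

# Propagate the input set through the network
c, r = interval_affine(x_c, x_r, W1, b1)
c, r = interval_relu(c, r)
c, r = interval_affine(c, r, W2, b2)
print("output enclosure radius:", r)
```

A set-based training loss could then penalize the enclosure radius `r` in addition to the task loss, shrinking the output set and thus both hardening the network and reducing the conservatism seen by downstream verifiers.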