Neural networks are vulnerable to adversarial attacks, i.e., small input perturbations can substantially change the output of a neural network. Safety-critical environments require neural networks that are robust against input perturbations. However, training and formally verifying robust neural networks is challenging. We address this challenge by employing, for the first time, an end-to-end set-based training procedure that trains robust neural networks for formal verification. Our training procedure drastically simplifies the subsequent formal robustness verification of the trained neural network. While previous research has predominantly focused on augmenting neural network training with adversarial attacks, our approach leverages set-based computing to train neural networks with entire sets of perturbed inputs. We demonstrate that our set-based training procedure effectively trains robust neural networks that are easier to verify. In many cases, set-based trained neural networks outperform neural networks trained with state-of-the-art adversarial attacks.
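To illustrate what propagating an entire set of perturbed inputs through a network can look like, the following is a minimal sketch rather than the procedure used in this work. It assumes the input set is an axis-aligned box (an l-infinity ball of radius `eps`) and uses plain interval arithmetic; the abstract does not state which set representation is used, and all function names here are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def input_set(x: Vector, eps: float) -> Tuple[Vector, Vector]:
    """Box of all inputs within eps of x in every coordinate (assumed set shape)."""
    return [xi - eps for xi in x], [xi + eps for xi in x]


def affine_bounds(W: Matrix, b: Vector, lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """Exact interval bounds of W @ x + b for x ranging over the box [lower, upper]."""
    out_lower, out_upper = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps which end of the interval is extreme.
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def output_bounds(layers: List[Tuple[Matrix, Vector]], x: Vector, eps: float) -> Tuple[Vector, Vector]:
    """Propagate the input box through affine layers with ReLU between them."""
    lower, upper = input_set(x, eps)
    for i, (W, b) in enumerate(layers):
        lower, upper = affine_bounds(W, b, lower, upper)
        if i < len(layers) - 1:  # no activation after the last layer
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def is_verified(lower: Vector, upper: Vector, target: int) -> bool:
    """Robust if the target logit's lower bound exceeds every other upper bound."""
    return all(lower[target] > upper[j] for j in range(len(upper)) if j != target)


# Hypothetical two-layer network, for illustration only.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
lower, upper = output_bounds(layers, x=[1.0, 0.0], eps=0.1)
print(is_verified(lower, upper, target=0))
```

In a set-based training loop, a loss would be defined on such output bounds rather than on a single output point, so that the size of the output set is penalized during training. Tighter set representations give less conservative bounds than the box used in this sketch.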