In the last decade, a large body of work has emerged on the robustness of neural networks, i.e., checking whether the decision remains unchanged when the input is slightly perturbed. However, most of these approaches ignore the confidence of a neural network in its output. In this work, we aim to develop a generalized framework for formally reasoning about confidence along with robustness in neural networks. We propose a simple yet expressive grammar that captures various confidence-based specifications. We develop a novel and unified technique to verify all instances of the grammar in a homogeneous way, viz., by adding a few additional layers to the neural network, which enables the use of any state-of-the-art neural network verification tool. We perform an extensive experimental evaluation over a large suite of 8870 benchmarks, where the largest network has 138M parameters, and show that our approach outperforms ad-hoc encoding approaches by a significant margin.
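To make the reduction concrete, here is a minimal sketch (not the paper's exact construction) of the core idea: a confidence specification such as "class c is predicted with confidence at least tau" can be expressed by appending extra computation to the network, producing a single scalar whose sign encodes the property. A standard verifier can then check that scalar instead of a bespoke confidence encoding. The function names and the specific threshold form below are hypothetical, chosen only for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def confidence_spec_layer(logits, target_class, tau):
    # Hypothetical "specification layer": maps the network's logits to
    # g = softmax(logits)[target_class] - tau, so that
    # g >= 0  iff  the network assigns target_class confidence >= tau.
    # Appended to the network, this reduces a confidence property to a
    # standard sign check on one extra output, which off-the-shelf
    # verifiers already support.
    return softmax(logits)[target_class] - tau

logits = np.array([3.0, 0.5, -1.0])
g = confidence_spec_layer(logits, target_class=0, tau=0.9)
# g >= 0 here iff class 0 is predicted with confidence of at least 0.9
```

In a verification setting, the same construction is applied symbolically: the appended layer becomes part of the network graph, and the verifier proves the sign condition over a whole perturbation region rather than a single input.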