Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and benchmarks to evaluate them. However, existing benchmarks typically lack ground-truth for hard instances where no current verifier can verify the property and no counterexample can be found. This makes it difficult to validate the soundness of a verifier, when it claims verification on such challenging instances that no other verifier can handle. In this work, we develop a new benchmark for NN verification, named "SoundnessBench", specifically for testing the soundness of NN verifiers. SoundnessBench consists of instances with deliberately inserted counterexamples that are hidden from adversarial attacks commonly used to find counterexamples. Thereby, it can identify false verification claims when hidden counterexamples are known to exist. We design a training method to produce NNs with hidden counterexamples and systematically construct our SoundnessBench with instances across various model architectures, activation functions, and input data. We demonstrate that our training effectively produces hidden counterexamples and our SoundnessBench successfully identifies bugs in state-of-the-art NN verifiers. Our code is available at https://github.com/MVP-Harry/SoundnessBench and our benchmark is available at https://huggingface.co/datasets/SoundnessBench/SoundnessBench.
翻译:神经网络(NN)验证旨在形式化验证神经网络的属性,这对于确保基于神经网络的模型在安全关键应用中的行为至关重要。近年来,研究社区已开发出许多神经网络验证器及用于评估它们的基准。然而,现有基准通常缺乏对于困难实例的真实标注,这些实例是目前任何验证器都无法验证属性且无法找到反例的情况。这使得当验证器声称能够处理其他验证器无法应对的此类挑战性实例时,难以验证其可靠性。在本工作中,我们开发了一个名为“SoundnessBench”的神经网络验证新基准,专门用于测试神经网络验证器的可靠性。SoundnessBench包含一系列经过精心设计、植入了反例的实例,这些反例对常用于寻找反例的对抗性攻击具有隐蔽性。因此,当已知存在隐藏反例时,该基准能够识别错误的验证声明。我们设计了一种训练方法,以生成具有隐藏反例的神经网络,并系统性地构建了SoundnessBench,涵盖不同模型架构、激活函数和输入数据的实例。我们证明了我们的训练方法能有效产生隐藏反例,且SoundnessBench成功识别了当前最先进神经网络验证器中的缺陷。我们的代码可在https://github.com/MVP-Harry/SoundnessBench获取,基准数据集可在https://huggingface.co/datasets/SoundnessBench/SoundnessBench访问。