Existing neural network verifiers compute a proof that each input is handled correctly under a given perturbation by propagating a symbolic abstraction of reachable values at each layer. This process is repeated from scratch independently for each input (e.g., image) and perturbation (e.g., rotation), leading to an expensive overall proof effort when handling an entire dataset. In this work, we introduce a new method for reducing this verification cost without losing precision based on a key insight that abstractions obtained at intermediate layers for different inputs and perturbations can overlap or contain each other. Leveraging our insight, we introduce the general concept of shared certificates, enabling proof effort reuse across multiple inputs to reduce overall verification costs. We perform an extensive experimental evaluation to demonstrate the effectiveness of shared certificates in reducing the verification cost on a range of datasets and attack specifications on image classifiers including the popular patch and geometric perturbations. We release our implementation at https://github.com/eth-sri/proof-sharing.
翻译:现有神经网络验证器通过逐层传播可达值的符号抽象,为每个输入在给定扰动下的正确性计算证明。这一过程需对每个输入(如图像)及扰动(如旋转)独立重复进行,导致处理整个数据集时产生高昂的整体证明开销。本文提出一种在不损失精度的前提下降低验证成本的新方法,其核心洞察在于:不同输入和扰动在中间层获得的抽象存在重叠或包含关系。基于此发现,我们引入共享证明的通用概念,实现跨多个输入的证明复用,从而降低整体验证成本。我们进行了大量实验评估,证明共享证明在图像分类器(包括流行的补丁攻击与几何扰动)的多个数据集和攻击规范下,能有效降低验证成本。我们将实现代码开源在https://github.com/eth-sri/proof-sharing。