Deep neural networks are vulnerable to adversarial attacks. Most $L_{0}$-norm based white-box attacks craft perturbations using the gradient of the model with respect to the input. Because computing the Hessian matrix is costly in both time and memory, the use of the Hessian or an approximate Hessian in white-box attacks has gradually been shelved. In this work, we note that the sparsity requirement on perturbations naturally lends itself to the use of Hessian information, because restricting the perturbation to a limited number of pixels keeps the number of optimization variables small. We study the attack performance and computation cost of a Hessian-based attack method with a limited number of perturbation pixels. Specifically, we propose the Limited Pixel BFGS (LP-BFGS) attack method, which combines a perturbation pixel selection strategy with the BFGS algorithm. The pixels with the top-k attribution scores, calculated by the Integrated Gradients method, are taken as the optimization variables of the LP-BFGS attack. Experimental results across different networks and datasets demonstrate that our approach achieves attack performance comparable to existing solutions at reasonable computational cost across different numbers of perturbation pixels.
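The two-stage idea described above (select k pixels by attribution, then run a quasi-Newton search over only those k variables) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear softmax classifier `W`, the toy sizes, the zero baseline, ranking by absolute attribution, the penalty weight `c`, and the objective (log-probability of the original class plus a squared-norm penalty) are all assumptions chosen to keep the example self-contained. The point it shows is that the BFGS inverse-Hessian approximation is only k x k, not D x D.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C, k = 64, 3, 8                  # toy sizes: 64 "pixels", 3 classes, 8 perturbed pixels
W = rng.normal(size=(C, D))         # hypothetical linear classifier (assumption)
x = rng.uniform(0.0, 1.0, size=D)   # toy input
y = int(np.argmax(W @ x))           # attack the class the model currently predicts

def logp_and_grad(v):
    """Log-probability of class y at input v, and its gradient w.r.t. v."""
    z = W @ v
    z = z - z.max()                          # stabilize the softmax
    logp = z - np.log(np.exp(z).sum())
    return logp[y], W.T @ (np.eye(C)[y] - np.exp(logp))

def integrated_gradients(v, baseline, steps=50):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [logp_and_grad(baseline + a * (v - baseline))[1] for a in alphas]
    return (v - baseline) * np.mean(grads, axis=0)

# Stage 1: keep the k pixels with the largest attribution magnitude (assumed ranking rule).
scores = integrated_gradients(x, np.zeros(D))
idx = np.argsort(-np.abs(scores))[:k]

c = 0.05                                     # assumed penalty weight on perturbation size

def objective(z):
    """Objective and gradient over the k selected pixel offsets only."""
    delta = np.zeros(D)
    delta[idx] = z
    logp, g = logp_and_grad(x + delta)
    return logp + c * (z @ z), g[idx] + 2.0 * c * z

def bfgs(f_and_g, z0, iters=100, tol=1e-8):
    """Plain BFGS with Armijo backtracking; H approximates the k x k inverse Hessian."""
    z = z0.copy()
    I = np.eye(len(z))
    H = I.copy()
    f, g = f_and_g(z)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g
        t = 1.0
        f_new, g_new = f_and_g(z + t * d)
        while f_new > f + 1e-4 * t * (g @ d) and t > 1e-10:
            t *= 0.5
            f_new, g_new = f_and_g(z + t * d)
        if f_new > f:                        # line search failed; stop
            break
        s, yv = t * d, g_new - g
        sy = s @ yv
        if sy > 1e-12:                       # curvature condition keeps H positive definite
            rho = 1.0 / sy
            H = (I - rho * np.outer(s, yv)) @ H @ (I - rho * np.outer(yv, s)) + rho * np.outer(s, s)
        z, f, g = z + s, f_new, g_new
    return z

# Stage 2: optimize only the selected pixels.
z_opt = bfgs(objective, np.zeros(k))
delta = np.zeros(D)
delta[idx] = z_opt
logp_before = logp_and_grad(x)[0]
logp_after = logp_and_grad(x + delta)[0]
print(np.count_nonzero(delta), logp_before, logp_after)
```

In this sketch the perturbation touches at most `k` entries of the input, and the search lowers the log-probability of the originally predicted class. A real attack would additionally clip `x + delta` to the valid pixel range and use the network's own gradients, both of which are omitted here.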