Adversarial attacks generate adversarial inputs by applying small but intentionally worst-case perturbations to samples from the dataset, causing even state-of-the-art deep neural networks to output incorrect answers with high confidence. In response, adversarial defense techniques have been developed to improve the security and robustness of models against such attacks. Over time, a game-like competition has formed between attackers and defenders, in which each player tries to play its best strategy against the other while maximizing its own payoff. To solve the game, each player chooses an optimal strategy based on a prediction of the opponent's strategy choice. In this work, we take the defender's side and apply game-theoretic approaches to defending against attacks. We use two randomization methods, random initialization and stochastic activation pruning, to create diversity among networks. In addition, we use one denoising technique, super resolution, to improve models' robustness by preprocessing images before attacks. Our experimental results indicate that all three methods can effectively improve the robustness of deep neural networks.
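The abstract names stochastic activation pruning without describing its mechanics. As a rough illustration, the sketch below follows the commonly described form of the technique: units in a layer are sampled with probability proportional to the magnitude of their activation, units never sampled are zeroed, and surviving units are rescaled so the layer output is unchanged in expectation. The function name, the `num_draws` parameter, and the use of plain Python lists are assumptions made for this example; it is not the authors' implementation.

```python
import random


def stochastic_activation_pruning(activations, num_draws, rng=None):
    """Randomly prune one layer's activations, favouring large magnitudes.

    activations: list of floats (one layer's output for one input).
    num_draws:   number of samples drawn with replacement
                 (hypothetical hyperparameter name).
    rng:         optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    total = sum(abs(a) for a in activations)
    if total == 0.0:
        return list(activations)  # all-zero layer: nothing to sample

    # Sampling probability of each unit is proportional to |activation|.
    probs = [abs(a) / total for a in activations]

    # Draw indices with replacement; a unit drawn at least once survives.
    drawn = rng.choices(range(len(activations)), weights=probs, k=num_draws)
    kept = set(drawn)

    pruned = []
    for i, (a, p) in enumerate(zip(activations, probs)):
        if i in kept:
            # Probability that unit i is drawn at least once in num_draws.
            keep_prob = 1.0 - (1.0 - p) ** num_draws
            # Inverse-probability rescaling keeps the expectation unchanged.
            pruned.append(a / keep_prob)
        else:
            pruned.append(0.0)
    return pruned


if __name__ == "__main__":
    layer_output = [0.5, -2.0, 0.0, 1.5]
    print(stochastic_activation_pruning(layer_output, num_draws=3))
```

Because the surviving set changes on every forward pass, repeated calls on the same input yield different pruned activations, which is the source of the randomness the defender relies on.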