Deep learning has brought convenience and rapid progress, but its sensitivity to inconspicuous perturbations (i.e., adversarial attacks) makes it untrustworthy. Attackers can exploit this sensitivity to manipulate predictions. To defend against such attacks, we devise a proactive strategy that "attacks" media before a third party can do so, so that when the protected media are subsequently attacked, the adversarial perturbations are automatically neutralized. This strategy, dubbed Fast Preemption, provides an efficient, transferable preemptive defense by using different models for labeling inputs and learning crucial features. A forward-backward cascade learning algorithm computes the protective perturbations, starting with forward propagation optimization to achieve rapid convergence, followed by iterative backward propagation learning to alleviate overfitting. This strategy offers state-of-the-art transferability and protection across various systems. Running for only three steps, our Fast Preemption framework outperforms benchmark training-time, test-time, and preemptive adversarial defenses. We also devise what is, to our knowledge, the first effective white-box adaptive reversion attack and demonstrate that the protection added by our defense is irreversible unless the backbone model, algorithm, and settings are fully compromised. This work provides a new direction for developing proactive defenses against adversarial attacks. The proposed methodology will be made available on GitHub.
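To make the preemptive idea concrete, the following is a minimal sketch and not the Fast Preemption implementation. It assumes a toy two-class linear classifier, an L-infinity budget `eps`, and plain signed-gradient steps; it does not reproduce the forward-backward cascade. All names (`preempt`, `fgsm`, `W_label`, `W_backbone`) and all numbers are hypothetical and chosen only for illustration. One model labels the input, a different model supplies the gradients, and the input is nudged so that the loss for that label decreases, i.e., in the direction opposite to an attack.

```python
import math

def logits(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(W, x, y):
    return -math.log(softmax(logits(W, x))[y])

def loss_grad(W, x, y):
    # For linear logits z = W x, d(CE)/dx = W^T (softmax(z) - onehot(y)).
    p = softmax(logits(W, x))
    p[y] -= 1.0
    return [sum(W[k][j] * p[k] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def preempt(x, W_label, W_backbone, eps=0.1, steps=3):
    """Toy preemptive defense: descend the backbone's loss for the label
    assigned by a separate labeling model, within an L-inf ball of radius eps."""
    y = predict(W_label, x)          # label comes from the labeling model
    step = eps / steps
    x_prot = list(x)
    for _ in range(steps):
        g = loss_grad(W_backbone, x_prot, y)   # gradients come from the backbone
        x_prot = [
            # step against the gradient, stay within eps of x and inside [0, 1]
            clip(clip(xp - step * sign(gj), x0 - eps, x0 + eps), 0.0, 1.0)
            for xp, gj, x0 in zip(x_prot, g, x)
        ]
    return x_prot, y

def fgsm(x, W, y, eps):
    """One-step signed-gradient attack that increases the loss for label y."""
    g = loss_grad(W, x, y)
    return [clip(v + eps * sign(gj), 0.0, 1.0) for v, gj in zip(x, g)]

# Hypothetical toy models and input (assumptions, not from the paper).
W_label = [[2.0, -1.0], [-1.0, 2.0]]
W_backbone = [[1.0, -1.0], [-1.0, 1.0]]
x = [0.6, 0.4]

x_prot, y = preempt(x, W_label, W_backbone)
adv_plain = fgsm(x, W_backbone, y, eps=0.15)
adv_prot = fgsm(x_prot, W_backbone, y, eps=0.15)

print("label:", y)
print("attacked, unprotected:", predict(W_backbone, adv_plain))
print("attacked, protected:  ", predict(W_backbone, adv_prot))
```

In this toy setup the attack flips the prediction on the unprotected input but not on the protected one, because the preemptive steps have already widened the margin for the assigned label. Real image classifiers are nonlinear, so this only illustrates the direction of the update, not the robustness reported in the abstract.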