Deep learning has brought convenience and rapid progress, but its sensitivity to inconspicuous perturbations (i.e., adversarial attacks) makes it untrustworthy. Attackers exploit this sensitivity by slightly manipulating transmitted messages. To defend against such attacks, we have devised a strategy for "attacking" the message before it is attacked. This strategy, dubbed Fast Preemption, provides an efficient, transferable preemptive defense by using different models for labeling inputs and for learning crucial features. A forward-backward cascade learning algorithm computes the protective perturbations, starting with forward propagation optimization to achieve rapid convergence, followed by iterative backward propagation learning to alleviate overfitting. This strategy offers state-of-the-art transferability and protection across various systems. Running for only three steps, our Fast Preemption framework outperforms benchmark training-time, test-time, and preemptive adversarial defenses. We have also devised what is, to our knowledge, the first effective white-box adaptive reversion attack, and we demonstrate that the protection added by our defense strategy is irreversible unless the backbone model, algorithm, and settings are fully compromised. This work provides a new direction for developing active defenses against adversarial attacks.
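The general idea of a preemptive defense, perturbing an input so that it sits more firmly inside its own class before an attacker can touch it, can be illustrated with a toy sketch. The sketch below is not the Fast Preemption algorithm: it replaces the forward-backward cascade with plain projected signed-gradient steps, and it uses two hypothetical logistic models in place of the classifier and backbone. The names `preempt`, `predict_proba`, `labeler`, `backbone`, `eps`, and `steps` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def predict_proba(model, x):
    """Probability of class 1 under a toy logistic model (weights, bias)."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def preempt(x, labeler, backbone, eps=0.1, steps=3):
    """Return (protected_x, label) for input x.

    Assumption: `labeler` assigns the label and a *different* model,
    `backbone`, supplies the gradients, loosely mirroring the abstract's
    use of different models for labeling and feature learning.
    """
    label = 1 if predict_proba(labeler, x) >= 0.5 else 0
    backbone_weights, _ = backbone
    step_size = eps / steps
    protected = list(x)
    for _ in range(steps):
        p = predict_proba(backbone, protected)
        # Gradient of the cross-entropy loss w.r.t. the input of a
        # logistic model is (p - label) * w.
        grad = [(p - label) * w for w in backbone_weights]
        # Descend the loss: the reverse of an adversarial attack.
        protected = [pi - step_size * sign(g) for pi, g in zip(protected, grad)]
        # Project back into the L-infinity ball of radius eps around x.
        protected = [min(max(pi, xi - eps), xi + eps)
                     for pi, xi in zip(protected, x)]
    return protected, label


if __name__ == "__main__":
    labeler = ([1.0, -0.5, 0.8], 0.0)     # hypothetical labeling model
    backbone = ([0.9, -0.4, 1.1], -0.1)   # hypothetical backbone model
    x = [0.2, -0.1, 0.4]
    protected, label = preempt(x, labeler, backbone)
    print(label, predict_proba(backbone, x), predict_proba(backbone, protected))
```

In this toy setting the protected input stays within `eps` of the original in every coordinate while the backbone's confidence in the assigned label rises, which is the margin an attacker would then have to overcome. Whether such protection transfers to other models, as the abstract claims for Fast Preemption, is not something this sketch demonstrates.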