In real-world adversarial settings, defenders are unlikely to have access to the full range of deployment-time adversaries during training, and adversaries are likely to use realistic adversarial distortions that are not limited to small L_p-constrained perturbations. To address this discrepancy between research and reality, we introduce eighteen novel adversarial attacks, which we use to create ImageNet-UA, a new benchmark for evaluating model robustness against a wide range of unforeseen adversaries. We use our benchmark to identify a range of defense strategies that can help overcome this generalization gap, finding a rich space of techniques that improve unforeseen robustness. We hope the greater variety and realism of ImageNet-UA will make it a useful tool for those working on real-world worst-case robustness, enabling the development of more robust defenses that generalize beyond attacks seen during training.