Identifying Adversarially Attackable and Robust Samples - 专知论文

会员服务 ·

0

攻击 · 抗攻击 · 样本 · 对抗 · 对抗攻击 ·

2023 年 3 月 27 日

Identifying Adversarially Attackable and Robust Samples

翻译：识别可被对抗攻击与鲁棒的样本

Vyas Raina,Mark Gales

Adversarial attacks insert small, imperceptible perturbations to input samples that cause large, undesired changes to the output of deep learning models. Despite extensive research on generating adversarial attacks and building defense systems, there has been limited research on understanding adversarial attacks from an input-data perspective. This work introduces the notion of sample attackability, where we aim to identify samples that are most susceptible to adversarial attacks (attackable samples) and conversely also identify the least susceptible samples (robust samples). We propose a deep-learning-based method to detect the adversarially attackable and robust samples in an unseen dataset for an unseen target model. Experiments on standard image classification datasets enables us to assess the portability of the deep attackability detector across a range of architectures. We find that the deep attackability detector performs better than simple model uncertainty-based measures for identifying the attackable/robust samples. This suggests that uncertainty is an inadequate proxy for measuring sample distance to a decision boundary. In addition to better understanding adversarial attack theory, it is found that the ability to identify the adversarially attackable and robust samples has implications for improving the efficiency of sample-selection tasks, e.g. active learning in augmentation for adversarial training.

翻译：对抗攻击通过在输入样本中注入微小、不易察觉的扰动，导致深度学习模型输出产生巨大且非期望的变化。尽管在生成对抗攻击和构建防御系统方面已有大量研究，但从输入数据角度理解对抗攻击的研究仍十分有限。本文引入样本可攻击性的概念，旨在识别最易受对抗攻击影响的样本（可攻击样本），反之亦能识别最不易受影响的样本（鲁棒样本）。我们提出一种基于深度学习的方法，用于在未见过的数据集中针对未见过的目标模型检测可对抗攻击与鲁棒的样本。在标准图像分类数据集上的实验使我们能够评估深度可攻击性检测器在多种架构上的可移植性。研究发现，与基于模型不确定性的简单度量相比，深度可攻击性检测器在识别可攻击/鲁棒样本方面表现更优。这表明不确定性并非衡量样本到决策边界距离的充分代理。除加深对对抗攻击理论的理解外，识别可对抗攻击与鲁棒样本的能力对于提升样本选择任务（例如对抗训练中增强模块的主动学习）的效率具有实际意义。

0

相关内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

专知会员服务

15+阅读 · 2022年3月12日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

47+阅读 · 2020年10月31日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

生成性对抗网络:理论模型、评估指标和最近发展的概述，Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments

生成性对抗网络:理论模型、评估指标和最近发展的概述，Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments

专知会员服务

42+阅读 · 2020年5月30日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding Image Classiﬁcation Decisions and Improved Neural Network Robustness ）

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding Image Classiﬁcation Decisions and Improved Neural Network Robustness ）

专知会员服务

14+阅读 · 2019年11月25日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

论文浅尝 | Language Models (Mostly) Know What They Know

论文浅尝 | Language Models (Mostly) Know What They Know

开放知识图谱

2+阅读 · 2022年11月18日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

【清华出品】NLP新方向文本对抗攻击与防御必读论文列表

【清华出品】NLP新方向文本对抗攻击与防御必读论文列表

专知

21+阅读 · 2019年7月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

全谱Förster型太阳能电池中的能量转移过程调控及其激子猝灭机制

国家自然科学基金

0+阅读 · 2015年12月31日

高声强下穿孔板覆盖多孔材料层结构的吸声机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于模型检测的非确定性概率模型学习

国家自然科学基金

2+阅读 · 2013年12月31日

非金属/过渡金属共掺杂的GaN纳米线可控制备及室温铁磁性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于WorldView-3和OP-ELM的矿化蚀变提取方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

一株含双降解质粒的红球菌（Rhodococcus sp.）二噁英降解机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

交通监控中面向行驶车辆的图像超分辨率重建方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀土掺杂钨酸盐下转换材料的制备及其在太阳能电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

PS-DInSAR提取山区煤矿开采沉陷参数的方法与精度研究

国家自然科学基金

0+阅读 · 2011年12月31日

UOR: Universal Backdoor Attacks on Pre-trained Language Models

Arxiv

0+阅读 · 2023年5月16日

Generalized Kernel Two-Sample Tests

Arxiv

0+阅读 · 2023年5月14日

Diffusion Models for Imperceptible and Transferable Adversarial Attack

Arxiv

0+阅读 · 2023年5月14日

Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Arxiv

0+阅读 · 2023年5月12日

Stratified Adversarial Robustness with Rejection

Arxiv

0+阅读 · 2023年5月12日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

最新内容

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

专知会员服务

0+阅读 · 今天14:48

博士论文 | 从算法到基础模型：强化学习的统一视角

博士论文 | 从算法到基础模型：强化学习的统一视角

专知会员服务

0+阅读 · 今天14:46

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

4+阅读 · 今天8:04

《异构人类团队的协作决策过程混合建模研究》

《异构人类团队的协作决策过程混合建模研究》

专知会员服务

4+阅读 · 今天7:59

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

专知会员服务

4+阅读 · 今天7:56

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

专知会员服务

4+阅读 · 今天7:50

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

4+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

6+阅读 · 7月27日

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

13+阅读 · 7月27日

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

7+阅读 · 7月27日

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

7+阅读 · 7月27日

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

5+阅读 · 7月27日

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

11+阅读 · 7月27日

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

7+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

10+阅读 · 7月26日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

专知会员服务

15+阅读 · 2022年3月12日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

47+阅读 · 2020年10月31日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

生成性对抗网络:理论模型、评估指标和最近发展的概述，Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments

生成性对抗网络:理论模型、评估指标和最近发展的概述，Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments

专知会员服务

42+阅读 · 2020年5月30日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding Image Classiﬁcation Decisions and Improved Neural Network Robustness ）

【Nature论文】用于理解图像分类决策和改进神经网络鲁棒性的对抗性解释（Adversarial Explanations for Understanding Image Classiﬁcation Decisions and Improved Neural Network Robustness ）

专知会员服务

14+阅读 · 2019年11月25日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

博士论文 | 从算法到基础模型：强化学习的统一视角

《异构人类团队的协作决策过程混合建模研究》

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

面向国防作战的最佳自主与蜂群无人机技术

相关资讯

论文浅尝 | Language Models (Mostly) Know What They Know

论文浅尝 | Language Models (Mostly) Know What They Know

开放知识图谱

2+阅读 · 2022年11月18日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

【清华出品】NLP新方向文本对抗攻击与防御必读论文列表

【清华出品】NLP新方向文本对抗攻击与防御必读论文列表

专知

21+阅读 · 2019年7月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

UOR: Universal Backdoor Attacks on Pre-trained Language Models

Arxiv

0+阅读 · 2023年5月16日

Generalized Kernel Two-Sample Tests

Arxiv

0+阅读 · 2023年5月14日

Diffusion Models for Imperceptible and Transferable Adversarial Attack

Arxiv

0+阅读 · 2023年5月14日

Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

Arxiv

0+阅读 · 2023年5月12日

Stratified Adversarial Robustness with Rejection

Arxiv

0+阅读 · 2023年5月12日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Composite Adversarial Attacks

Arxiv

12+阅读 · 2020年12月10日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

相关基金

全谱Förster型太阳能电池中的能量转移过程调控及其激子猝灭机制

国家自然科学基金

0+阅读 · 2015年12月31日

高声强下穿孔板覆盖多孔材料层结构的吸声机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于模型检测的非确定性概率模型学习

国家自然科学基金

2+阅读 · 2013年12月31日

非金属/过渡金属共掺杂的GaN纳米线可控制备及室温铁磁性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于WorldView-3和OP-ELM的矿化蚀变提取方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

一株含双降解质粒的红球菌（Rhodococcus sp.）二噁英降解机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

交通监控中面向行驶车辆的图像超分辨率重建方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀土掺杂钨酸盐下转换材料的制备及其在太阳能电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

PS-DInSAR提取山区煤矿开采沉陷参数的方法与精度研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员