Robust Models Are More Interpretable Because Attributions Look Normal - 专知论文

会员服务 ·

0

稳健性 · 规范化的 · INFORMS · MoDELS · DeepLift模型 ·

2021 年 10 月 6 日

Robust Models Are More Interpretable Because Attributions Look Normal

翻译：强强型模型更易解释,因为归归属看正常

Zifan Wang,Matt Fredrikson,Anupam Datta

Recent work has found that adversarially-robust deep networks used for image classification are more interpretable: their feature attributions tend to be sharper, and are more concentrated on the objects associated with the image's ground-truth class. We show that smooth decision boundaries play an important role in this enhanced interpretability, as the model's input gradients around data points will more closely align with boundaries' normal vectors when they are smooth. Thus, because robust models have smoother boundaries, the results of gradient-based attribution methods, like Integrated Gradients and DeepLift, will capture more accurate information about nearby decision boundaries. This understanding of robust interpretability leads to our second contribution: \emph{boundary attributions}, which aggregate information about the normal vectors of local decision boundaries to explain a classification outcome. We show that by leveraging the key factors underpinning robust interpretability, boundary attributions produce sharper, more concentrated visual explanations -- even on non-robust models. Any example implementation can be found at \url{https://github.com/zifanw/boundary}.

翻译：最近的工作发现,用于图像分类的对抗- robust 深网络比较容易解释: 其特征属性往往更加清晰, 更集中于与图像地面真相类相关的对象。我们显示, 平稳的决定边界在这种增强解释性方面起着重要作用, 因为模型围绕数据点的输入梯度在数据点周围将更密切地与边界的正常矢量相匹配, 当它们平滑的时候。因此, 由于稳健的模型具有更平滑的边界, 以梯度为基础的归属方法, 如综合梯度和深海利夫特, 其结果将捕捉到关于附近决定界限的更准确的信息。这种对稳健可解释性的理解导致我们的第二个贡献 :\ emph{ 边际属性, 将关于本地决定边界的正常矢量的信息汇总, 解释结果。我们显示, 通过利用支撑稳健可解释性的关键因素, 边界属性产生更清晰、更集中的视觉解释 -- 甚至在非紫外线模型上。任何实例的实施可以在\ url{ http:// github.com/ fisterw/ briary} 中找到。

0

相关内容

稳健性

【CVPR2021教程】计算机视觉中的可解释机器学习

专知会员服务

64+阅读 · 2021年6月22日

【AAAI2021】可解释图胶囊网络物体检测

【AAAI2021】可解释图胶囊网络物体检测

专知会员服务

29+阅读 · 2021年1月4日

《可解释的机器学习-interpretable-ml》238页pdf

《可解释的机器学习-interpretable-ml》238页pdf

专知会员服务

210+阅读 · 2020年2月24日

可视化特征属性基线的影响，Visualizing the Impact of Feature Attribution Baselines

可视化特征属性基线的影响，Visualizing the Impact of Feature Attribution Baselines

专知会员服务

10+阅读 · 2020年1月16日

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond， workshop Ⅳ： Deep Geometric Learning of Big Data and Applications

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond， workshop Ⅳ： Deep Geometric Learning of Big Data and Applications

专知会员服务

20+阅读 · 2019年11月10日

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond，workshop Ⅲ：Geometry of Big Data

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond，workshop Ⅲ：Geometry of Big Data

专知会员服务

8+阅读 · 2019年11月10日

【ICCV 2019 Toturial】Interpretable Machine Learning for Computer Vision（用于计算机视觉的可解释性机器学习）

【ICCV 2019 Toturial】Interpretable Machine Learning for Computer Vision（用于计算机视觉的可解释性机器学习）

专知会员服务

32+阅读 · 2019年10月30日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

可解释的CNN

可解释的CNN

CreateAMind

18+阅读 · 2017年10月5日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes

Arxiv

0+阅读 · 2021年11月29日

Towards Rigorous Interpretations: a Formalisation of Feature Attribution

Arxiv

4+阅读 · 2021年4月26日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

This Looks Like That: Deep Learning for Interpretable Image Recognition

Arxiv

5+阅读 · 2018年12月19日

Deformable ConvNets v2: More Deformable, Better Results

Deformable ConvNets v2: More Deformable, Better Results

Arxiv

4+阅读 · 2018年11月27日

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

Arxiv

3+阅读 · 2018年7月19日

Are Generative Classifiers More Robust to Adversarial Attacks?

Are Generative Classifiers More Robust to Adversarial Attacks?

Arxiv

4+阅读 · 2018年7月9日

Interpretable Active Learning

Interpretable Active Learning

Arxiv

3+阅读 · 2018年6月24日

Discrete Autoencoders for Sequence Models

Arxiv

6+阅读 · 2018年1月29日

Being Robust (in High Dimensions) Can Be Practical

Arxiv

3+阅读 · 2017年12月14日

VIP会员

文章信息

相关主题

最新内容

无人机自主控制与人工智能：系统性综述

无人机自主控制与人工智能：系统性综述

专知会员服务

5+阅读 · 今天7:25

巡飞弹与反无人机系统——现代战场的两大支柱

巡飞弹与反无人机系统——现代战场的两大支柱

专知会员服务

2+阅读 · 今天6:54

《打造“黄金舰队”》57页报告

《打造“黄金舰队”》57页报告

专知会员服务

1+阅读 · 今天6:52

《北约数字教官网络发展路径》128页报告

《北约数字教官网络发展路径》128页报告

专知会员服务

1+阅读 · 今天6:33

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

6+阅读 · 6月25日

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

5+阅读 · 6月25日

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

9+阅读 · 6月25日

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

7+阅读 · 6月25日

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

8+阅读 · 6月25日

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

8+阅读 · 6月25日

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

10+阅读 · 6月25日

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

9+阅读 · 6月25日

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

9+阅读 · 6月25日

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

10+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

10+阅读 · 6月24日

相关VIP内容

【CVPR2021教程】计算机视觉中的可解释机器学习

专知会员服务

64+阅读 · 2021年6月22日

【AAAI2021】可解释图胶囊网络物体检测

【AAAI2021】可解释图胶囊网络物体检测

专知会员服务

29+阅读 · 2021年1月4日

《可解释的机器学习-interpretable-ml》238页pdf

《可解释的机器学习-interpretable-ml》238页pdf

专知会员服务

210+阅读 · 2020年2月24日

可视化特征属性基线的影响，Visualizing the Impact of Feature Attribution Baselines

可视化特征属性基线的影响，Visualizing the Impact of Feature Attribution Baselines

专知会员服务

10+阅读 · 2020年1月16日

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond， workshop Ⅳ： Deep Geometric Learning of Big Data and Applications

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond， workshop Ⅳ： Deep Geometric Learning of Big Data and Applications

专知会员服务

20+阅读 · 2019年11月10日

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond，workshop Ⅲ：Geometry of Big Data

【IPAM workshops】加州大学洛杉矶分校会议：Geometry and Learning from Data in 3D and Beyond，workshop Ⅲ：Geometry of Big Data

专知会员服务

8+阅读 · 2019年11月10日

【ICCV 2019 Toturial】Interpretable Machine Learning for Computer Vision（用于计算机视觉的可解释性机器学习）

【ICCV 2019 Toturial】Interpretable Machine Learning for Computer Vision（用于计算机视觉的可解释性机器学习）

专知会员服务

32+阅读 · 2019年10月30日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

巡飞弹与反无人机系统——现代战场的两大支柱

《北约数字教官网络发展路径》128页报告

无人机自主控制与人工智能：系统性综述

《打造“黄金舰队”》57页报告

相关资讯

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

可解释的CNN

可解释的CNN

CreateAMind

18+阅读 · 2017年10月5日

【推荐】SVM实例教程

【推荐】SVM实例教程

机器学习研究会

17+阅读 · 2017年8月26日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes

Arxiv

0+阅读 · 2021年11月29日

Towards Rigorous Interpretations: a Formalisation of Feature Attribution

Arxiv

4+阅读 · 2021年4月26日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

This Looks Like That: Deep Learning for Interpretable Image Recognition

Arxiv

5+阅读 · 2018年12月19日

Deformable ConvNets v2: More Deformable, Better Results

Deformable ConvNets v2: More Deformable, Better Results

Arxiv

4+阅读 · 2018年11月27日

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

Arxiv

3+阅读 · 2018年7月19日

Are Generative Classifiers More Robust to Adversarial Attacks?

Are Generative Classifiers More Robust to Adversarial Attacks?

Arxiv

4+阅读 · 2018年7月9日

Interpretable Active Learning

Interpretable Active Learning

Arxiv

3+阅读 · 2018年6月24日

Discrete Autoencoders for Sequence Models

Arxiv

6+阅读 · 2018年1月29日

Being Robust (in High Dimensions) Can Be Practical

Arxiv

3+阅读 · 2017年12月14日

微信扫码咨询专知VIP会员