Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks - 专知 (Zhuanzhi) Papers

Pruning · Machine learning models · Learning models · Deep neural networks · Fault models

April 12, 2023



Lorenz Linhardt, Klaus-Robert Müller, Grégoire Montavon

From arXiv, 14 pages + supplement

Explainable AI has become a popular tool for validating machine learning models. Mismatches between the explained model's decision strategy and the user's domain knowledge (e.g. Clever Hans effects) have also been recognized as a starting point for improving faulty models. However, it is less clear what to do when the user and the explanation agree. In this paper, we demonstrate that acceptance of explanations by the user is not a guarantee that an ML model functions well; in particular, some Clever Hans effects may remain undetected. Such hidden flaws of the model can nevertheless be mitigated, and we demonstrate this by contributing a new method, Explanation-Guided Exposure Minimization (EGEM), that preemptively prunes variations in the ML model that have not been the subject of positive explanation feedback. Experiments on natural image data demonstrate that our approach leads to models that strongly reduce their reliance on hidden Clever Hans strategies and consequently achieve higher accuracy on new data.
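To make the idea of pruning unvalidated variations more concrete, the following is a minimal sketch and not the paper's actual EGEM formulation, which the abstract does not spell out. It assumes a single linear layer and a hypothetical set of reference activations collected on data whose explanations the user accepted. Each input feature is scaled by a shrinkage factor derived from its second moment on that reference data, so features that never activate there are removed. The function names, the `alpha` regularizer, and the shrinkage formula are all illustrative assumptions.

```python
import numpy as np

def shrink_factors(ref_activations, alpha=1e-2):
    """Per-feature factor in [0, 1).

    Assumed rule (illustrative only): E[a_j^2] / (E[a_j^2] + alpha).
    Features that are active on the reference data get a factor near 1;
    features that never activate get exactly 0.
    """
    second_moment = np.mean(ref_activations ** 2, axis=0)
    return second_moment / (second_moment + alpha)

def prune_linear_layer(weights, ref_activations, alpha=1e-2):
    """Scale each input column of `weights` (shape: out x in) by its factor."""
    factors = shrink_factors(ref_activations, alpha)
    return weights * factors[np.newaxis, :]

# Hypothetical reference activations: feature 2 never fires on validated data.
rng = np.random.default_rng(0)
ref = np.abs(rng.normal(size=(100, 3)))
ref[:, 2] = 0.0

W = np.ones((2, 3))
W_pruned = prune_linear_layer(W, ref)
```

In this toy setup the third input column of `W_pruned` is zeroed out, while the first two columns are only slightly attenuated. A larger `alpha` prunes more aggressively, which trades away some fit on the validated data in exchange for lower exposure to features that were never checked.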

