How Does Attention Work in Vision Transformers? A Visual Analytics Attempt - 专知论文

会员服务 ·

0

视觉分析 · 视觉Transformer · Transformer · 结构化数据 · 分析 ·

2023 年 3 月 24 日

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt

翻译：视觉Transformer中的注意力机制如何工作？一次可视化分析尝试

Yiran Li,Junpeng Wang,Xin Dai,Liang Wang,Chin-Chia Michael Yeh,Yan Zheng,Wei Zhang,Kwan-Liu Ma

from arxiv, Accepted by PacificVis 2023 and selected to be published in TVCG

Vision transformer (ViT) expands the success of transformer models from sequential data to images. The model decomposes an image into many smaller patches and arranges them into a sequence. Multi-head self-attentions are then applied to the sequence to learn the attention between patches. Despite many successful interpretations of transformers on sequential data, little effort has been devoted to the interpretation of ViTs, and many questions remain unanswered. For example, among the numerous attention heads, which one is more important? How strong are individual patches attending to their spatial neighbors in different heads? What attention patterns have individual heads learned? In this work, we answer these questions through a visual analytics approach. Specifically, we first identify what heads are more important in ViTs by introducing multiple pruning-based metrics. Then, we profile the spatial distribution of attention strengths between patches inside individual heads, as well as the trend of attention strengths across attention layers. Third, using an autoencoder-based learning solution, we summarize all possible attention patterns that individual heads could learn. Examining the attention strengths and patterns of the important heads, we answer why they are important. Through concrete case studies with experienced deep learning experts on multiple ViTs, we validate the effectiveness of our solution that deepens the understanding of ViTs from head importance, head attention strength, and head attention pattern.

翻译：视觉Transformer（ViT）将Transformer模型从序列数据成功扩展到图像领域。该模型将图像分解为多个较小的图像块，并将它们排列成序列。随后，多头自注意力机制被应用于该序列以学习图像块之间的注意力关系。尽管已有许多关于Transformer在序列数据上解释的成功案例，但对ViT解释的研究仍相对较少，许多问题尚待解答。例如，在众多注意力头中，哪些更为重要？在不同注意力头中，单个图像块对其空间邻域的注意力强度如何？各个注意力头学习了哪些注意力模式？本研究通过可视化分析方法回答了这些问题。具体而言，我们首先引入多种基于剪枝的度量指标，识别出ViT中更重要的注意力头；其次，刻画各注意力头内图像块之间注意力强度的空间分布及跨注意力层的强度变化趋势；第三，基于自编码器学习方法，归纳各个注意力头可能学习到的所有注意力模式。通过分析重要注意力头的注意力强度与模式，我们揭示了其重要性的原因。通过与多位资深深度学习专家在多类ViT模型上展开的具体案例研究，验证了我们方法的有效性——该方法从注意力头的重要性、注意力强度及注意力模式三个维度深化了对ViT的理解。

0

相关内容

视觉分析

CVPR 2023｜打破CAM的局限性！ToCo：进一步激发 ViT 在弱监督语义分割的潜力

CVPR 2023｜打破CAM的局限性！ToCo：进一步激发 ViT 在弱监督语义分割的潜力

专知会员服务

21+阅读 · 2023年3月31日

Transformer如何用于视频？最新「视频Transformer」2022综述

Transformer如何用于视频？最新「视频Transformer」2022综述

专知会员服务

76+阅读 · 2022年1月20日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【IBM】在视觉和关系推理中迁移学习，Transfer Learning in Visual and Relational Reasoning

【IBM】在视觉和关系推理中迁移学习，Transfer Learning in Visual and Relational Reasoning

专知会员服务

45+阅读 · 2020年1月15日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

专知会员服务

45+阅读 · 2019年11月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

跨越注意力：Cross-Attention

跨越注意力：Cross-Attention

我爱读PAMI

172+阅读 · 2018年6月2日

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

专知

15+阅读 · 2018年5月15日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

水稻miR408耐冷功能与作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA-00941在肝癌间充质干细胞调控肿瘤干细胞自我更新中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

心肌肥厚发病机制及其调控新靶点：lncRNAs

国家自然科学基金

0+阅读 · 2014年12月31日

miR17-92在肾脏缺血再灌注中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧诱导的microRNA调控TIP30表达促进胰腺癌转移作用机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

lncRNAH19介导NAT1基因甲基化改变对乳腺癌他莫昔芬耐药的作用及其分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Hippo-YAP信号通路相关基因遗传变异与肝癌预后的关系及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

14-3-3ζ 参与出核转运机制调控NF-kB活性影响复发急性髓系白血病细胞生长的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

视觉注意的计算模型及其应用

国家自然科学基金

0+阅读 · 2012年12月31日

安慰剂效应及其可迁移性的认知神经科学机理

国家自然科学基金

0+阅读 · 2009年12月31日

Combining datasets to increase the number of samples and improve model fitting

Arxiv

0+阅读 · 2023年5月16日

Energy-Efficient URLLC Service Provision via a Near-Space Information Network

Arxiv

0+阅读 · 2023年5月16日

Visual Attention Methods in Deep Learning: An In-Depth Survey

Arxiv

45+阅读 · 2022年4月16日

Transformers Meet Visual Learning Understanding: A Comprehensive Review

Arxiv

28+阅读 · 2022年3月24日

A Survey on Vision Transformer

Arxiv

17+阅读 · 2022年2月23日

Attention Mechanisms in Computer Vision: A Survey

Arxiv

58+阅读 · 2021年11月15日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

A Survey on Visual Transformer

Arxiv

19+阅读 · 2020年12月23日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Distance-based Self-Attention Network for Natural Language Inference

Arxiv

10+阅读 · 2017年12月6日

VIP会员

文章信息

相关主题

视觉Transformer

结构化数据

最新内容

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

0+阅读 · 今天15:20

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

0+阅读 · 今天15:18

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

8+阅读 · 今天5:53

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

4+阅读 · 今天5:45

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

2+阅读 · 今天5:23

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

2+阅读 · 今天5:11

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

6+阅读 · 今天5:04

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

4+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

8+阅读 · 7月26日

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

专知会员服务

8+阅读 · 7月26日

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

专知会员服务

10+阅读 · 7月26日

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

专知会员服务

8+阅读 · 7月26日

《反无人机交战场景下的战斗归零研究》

《反无人机交战场景下的战斗归零研究》

专知会员服务

7+阅读 · 7月26日

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

专知会员服务

4+阅读 · 7月26日

博士论文 | 用代码结构感知方法推进代码大模型

博士论文 | 用代码结构感知方法推进代码大模型

专知会员服务

5+阅读 · 7月25日

相关VIP内容

CVPR 2023｜打破CAM的局限性！ToCo：进一步激发 ViT 在弱监督语义分割的潜力

CVPR 2023｜打破CAM的局限性！ToCo：进一步激发 ViT 在弱监督语义分割的潜力

专知会员服务

21+阅读 · 2023年3月31日

Transformer如何用于视频？最新「视频Transformer」2022综述

Transformer如何用于视频？最新「视频Transformer」2022综述

专知会员服务

76+阅读 · 2022年1月20日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

【IBM】在视觉和关系推理中迁移学习，Transfer Learning in Visual and Relational Reasoning

【IBM】在视觉和关系推理中迁移学习，Transfer Learning in Visual and Relational Reasoning

专知会员服务

45+阅读 · 2020年1月15日

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

【论文推荐】小样本视频合成，Few-shot Video-to-Video Synthesis

专知会员服务

24+阅读 · 2019年12月15日

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

【文章|自注意力(self-attention)机制图解】《Illustrated: Self-Attention》by Raimi Karim

专知会员服务

45+阅读 · 2019年11月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

美空军新型反无人机部队初探

博士论文 | 面向大模型推理的内存高效算法

《无人系统互操作性导论——无人系统联合架构（JAUS）》

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

跨越注意力：Cross-Attention

跨越注意力：Cross-Attention

我爱读PAMI

172+阅读 · 2018年6月2日

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

【论文推荐】最新八篇生成对抗网络相关论文—条件翻译、RGB-D动作识别、量子生成对抗网络、语义对齐、视频摘要、视觉-文本注意力

专知

15+阅读 · 2018年5月15日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

相关论文

Combining datasets to increase the number of samples and improve model fitting

Arxiv

0+阅读 · 2023年5月16日

Energy-Efficient URLLC Service Provision via a Near-Space Information Network

Arxiv

0+阅读 · 2023年5月16日

Visual Attention Methods in Deep Learning: An In-Depth Survey

Arxiv

45+阅读 · 2022年4月16日

Transformers Meet Visual Learning Understanding: A Comprehensive Review

Arxiv

28+阅读 · 2022年3月24日

A Survey on Vision Transformer

Arxiv

17+阅读 · 2022年2月23日

Attention Mechanisms in Computer Vision: A Survey

Arxiv

58+阅读 · 2021年11月15日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

A Survey on Visual Transformer

Arxiv

19+阅读 · 2020年12月23日

End-to-End Dense Video Captioning with Masked Transformer

Arxiv

14+阅读 · 2018年4月3日

Distance-based Self-Attention Network for Natural Language Inference

Arxiv

10+阅读 · 2017年12月6日

相关基金

水稻miR408耐冷功能与作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA-00941在肝癌间充质干细胞调控肿瘤干细胞自我更新中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

心肌肥厚发病机制及其调控新靶点：lncRNAs

国家自然科学基金

0+阅读 · 2014年12月31日

miR17-92在肾脏缺血再灌注中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

缺氧诱导的microRNA调控TIP30表达促进胰腺癌转移作用机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

lncRNAH19介导NAT1基因甲基化改变对乳腺癌他莫昔芬耐药的作用及其分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Hippo-YAP信号通路相关基因遗传变异与肝癌预后的关系及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

14-3-3ζ 参与出核转运机制调控NF-kB活性影响复发急性髓系白血病细胞生长的分子机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

视觉注意的计算模型及其应用

国家自然科学基金

0+阅读 · 2012年12月31日

安慰剂效应及其可迁移性的认知神经科学机理

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员