Generalized Relation Modeling for Transformer Tracking - 专知论文

April 10, 2023

Generalized Relation Modeling for Transformer Tracking


Shenyuan Gao, Chunluan Zhou, Jun Zhang

From arXiv. Accepted by CVPR 2023. Code and models are publicly available at https://github.com/Little-Podi/GRM

Compared with previous two-stream trackers, the recent one-stream tracking pipeline, which allows earlier interaction between the template and search region, has achieved a remarkable performance gain. However, existing one-stream trackers always let the template interact with all parts inside the search region throughout all the encoder layers. This could potentially lead to target-background confusion when the extracted feature representations are not sufficiently discriminative. To alleviate this issue, we propose a generalized relation modeling method based on adaptive token division. The proposed method is a generalized formulation of attention-based relation modeling for Transformer tracking, which inherits the merits of both previous two-stream and one-stream pipelines whilst enabling more flexible relation modeling by selecting appropriate search tokens to interact with template tokens. An attention masking strategy and the Gumbel-Softmax technique are introduced to facilitate the parallel computation and end-to-end learning of the token division module. Extensive experiments show that our method is superior to the two-stream and one-stream pipelines and achieves state-of-the-art performance on six challenging benchmarks with a real-time running speed.
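The abstract names two ingredients, a Gumbel-Softmax token division and an attention mask, without giving their details. The following is a minimal pure-Python sketch of how the two could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the two-logit (select / skip) layout per search token, the `[template..., search...]` token order, and the mask layout in which template queries see only template keys plus the selected search keys while search queries see all keys.

```python
import math
import random


def gumbel_softmax_select(logits, tau=1.0, rng=random):
    """Sample a hard select/skip decision per search token.

    `logits` is a list of [select_logit, skip_logit] pairs (hypothetical layout).
    Returns (hard, soft): hard[i] is 1 if token i is selected, soft[i] holds the
    relaxed probabilities. In an autodiff framework the hard decision would be
    used in the forward pass and the soft one for gradients (straight-through);
    plain Python has no gradients, so both are simply returned.
    """
    hard, soft = [], []
    for pair in logits:
        noisy = []
        for logit in pair:
            u = rng.uniform(1e-9, 1.0 - 1e-9)
            gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
            noisy.append((logit + gumbel) / tau)
        top = max(noisy)
        exps = [math.exp(v - top) for v in noisy]
        total = sum(exps)
        probs = [e / total for e in exps]
        soft.append(probs)
        hard.append(1 if probs[0] >= probs[1] else 0)
    return hard, soft


def build_attention_mask(num_template, selected):
    """Build mask[q][k], True where query q may attend to key k.

    Assumed layout: template queries are blocked from unselected search keys;
    all other query/key pairs stay visible. Masking instead of physically
    splitting the sequence keeps one fixed-size attention call per layer.
    """
    n = num_template + len(selected)
    mask = [[True] * n for _ in range(n)]
    for q in range(num_template):
        for j, keep in enumerate(selected):
            if not keep:
                mask[q][num_template + j] = False
    return mask


def masked_softmax(scores, mask_row):
    """Softmax over one query's scores, with blocked keys forced to weight 0."""
    masked = [s if ok else float("-inf") for s, ok in zip(scores, mask_row)]
    top = max(masked)
    exps = [math.exp(s - top) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Toy run: 2 template tokens, 3 search tokens, made-up division logits.
random.seed(0)
division_logits = [[2.0, -2.0], [-2.0, 2.0], [0.5, 0.0]]
selected, relaxed = gumbel_softmax_select(division_logits, tau=1.0)
mask = build_attention_mask(num_template=2, selected=selected)
template_weights = masked_softmax([0.1] * 5, mask[0])
```

Under this assumed layout, selecting every search token yields an all-true mask, which matches the full template-search interaction the abstract attributes to one-stream trackers, while selecting none leaves template queries attending only to template keys. Intermediate selections fall between those two extremes.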

