TRAT: Tracking by Attention Using Spatio-Temporal Features

Robust object tracking requires knowledge of the tracked objects' appearance, their motion, and how both evolve over time. Although motion provides distinctive and complementary information, especially for fast-moving objects, most recent tracking architectures focus primarily on appearance. In this paper, we propose a two-stream deep neural network tracker that uses both spatial and temporal features. Our architecture builds on the ATOM tracker and contains two backbones: (i) a 2D-CNN to capture appearance features and (ii) a 3D-CNN to capture motion features. The features returned by the two networks are then fused by an attention-based Feature Aggregation Module (FAM). Since the whole architecture is unified, it can be trained end-to-end. The experimental results show that the proposed tracker, TRAT (TRacking by ATtention), achieves state-of-the-art performance on most of the benchmarks and significantly outperforms the baseline ATOM tracker.
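To make the fusion step concrete, here is a toy sketch of attention-weighted aggregation of an appearance feature map and a motion feature map. The abstract does not describe the internals of the FAM, so everything here is an assumption made for illustration: the function names (`global_average`, `fuse_with_attention`), the choice of a per-channel softmax over globally averaged activations, and the premise that both streams have already been brought to the same shape. The actual module is learned end-to-end and may differ substantially.

```python
import math


def global_average(channel):
    """Mean of one channel's flattened activations."""
    return sum(channel) / len(channel)


def fuse_with_attention(appearance, motion):
    """Fuse two feature maps channel by channel (hypothetical helper).

    Each input is a list of channels; each channel is a flat list of floats.
    ASSUMPTION: the attention weights come from a softmax over the two
    streams' per-channel global averages. The paper's FAM is a learned
    module and is not specified in the abstract.
    """
    if len(appearance) != len(motion):
        raise ValueError("both streams must have the same number of channels")

    fused = []
    for app_ch, mot_ch in zip(appearance, motion):
        a, m = global_average(app_ch), global_average(mot_ch)
        # Numerically stable two-way softmax over the stream descriptors.
        peak = max(a, m)
        exp_a, exp_m = math.exp(a - peak), math.exp(m - peak)
        w_app = exp_a / (exp_a + exp_m)
        w_mot = exp_m / (exp_a + exp_m)
        # Weighted sum of the two streams at every spatial position.
        fused.append([w_app * x + w_mot * y for x, y in zip(app_ch, mot_ch)])
    return fused


# Two channels, two spatial positions each.
appearance = [[1.0, 1.0], [0.0, 0.0]]
motion = [[1.0, 1.0], [2.0, 2.0]]
fused = fuse_with_attention(appearance, motion)
```

In this toy setup, a channel where both streams respond equally is averaged, while a channel where the motion stream responds more strongly is pulled toward the motion features. A trained module would replace the fixed softmax rule with learned parameters.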

Related content

Attention mechanism

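The attention mechanism was first proposed in the field of computer vision, but it arguably became popular with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in a machine translation task; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic.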
