Optimized Information Flow for Transformer Tracking

One-stream Transformer trackers have shown outstanding performance on challenging benchmark datasets over the last three years, as they enable interaction between the target template and search region tokens to extract target-oriented features with mutual guidance. Previous approaches allow free bidirectional information flow between template and search tokens without investigating its influence on the tracker's discriminative capability. In this work, we conduct a detailed study of the information flow between tokens and, based on the findings, propose a novel Optimized Information Flow Tracking (OIFTrack) framework to enhance the discriminative capability of the tracker. The proposed OIFTrack blocks the interaction from all search tokens to target template tokens in the early encoder layers, because the large number of non-target tokens in the search region diminishes the importance of target-specific features. In the deeper encoder layers, search tokens are partitioned into target search tokens and non-target search tokens, and bidirectional information flow is allowed between target search tokens and template tokens to capture appearance changes of the target. In addition, since the proposed tracker incorporates dynamic background cues, it captures the surrounding information of the target and thereby avoids distractor objects. OIFTrack demonstrates outstanding performance on challenging benchmarks, particularly excelling on the one-shot tracking benchmark GOT-10k, where it achieves an average overlap of 74.6\%. The code, models, and results of this work are available at \url{https://github.com/JananiKugaa/OIFTrack}.
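The information-flow rule described above can be illustrated with an attention mask. The following is a minimal sketch, not the authors' implementation: it assumes tokens are ordered template-first, that "blocking flow from search to template" means template queries may not attend to search keys, and that the indices of target search tokens are supplied by some external step. The names `build_flow_mask` and `masked_attention` are hypothetical.

```python
import numpy as np

def build_flow_mask(n_template, n_search, target_search_idx=None):
    """Return a boolean matrix allowed[q, k]: may query q attend to key k?

    Assumed token order: template tokens first, then search tokens.
    target_search_idx (hypothetical input) lists search-token positions
    judged to belong to the target; None models an early encoder layer.
    """
    n = n_template + n_search
    allowed = np.ones((n, n), dtype=bool)
    # Early layers: template queries cannot read any search key,
    # so no information flows from search tokens into the template.
    allowed[:n_template, n_template:] = False
    if target_search_idx is not None:
        # Deeper layers: re-open the flow, but only from target search tokens.
        cols = n_template + np.asarray(target_search_idx, dtype=int)
        allowed[:n_template, cols] = True
    return allowed

def masked_attention(q, k, v, allowed):
    """Single-head scaled dot-product attention with a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(allowed, scores, -np.inf)  # blocked pairs get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 2 template tokens, 4 search tokens.
early_mask = build_flow_mask(2, 4)                            # early-layer rule
deep_mask = build_flow_mask(2, 4, target_search_idx=[1, 2])   # deeper-layer rule
```

In both cases search queries keep full access to template keys, so the template still guides the search region; only the reverse direction is restricted.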
