In this paper, we provide an intuitive view that simplifies Siamese-based trackers by converting the tracking task into a classification task. Under this view, we analyze these trackers in depth through visual simulations and real tracking examples, and find that their failures in some challenging situations can be attributed to decisive samples that are missing from offline training. Since the samples in the initial (first) frame contain rich sequence-specific information, we regard them as the decisive samples that represent the whole sequence. To quickly adapt the base model to new scenes, we present a compact latent network that makes full use of these decisive samples. Specifically, we present a statistics-based compact latent feature that enables fast adjustment by efficiently extracting the sequence-specific information. Furthermore, we design a new diverse sample mining strategy for training to further improve the discrimination ability of the proposed compact latent network. Finally, we propose a conditional updating strategy that efficiently updates the base model to handle scene variation during the tracking phase. To evaluate the generalization ability and effectiveness of our method, we apply it to adjust three classical Siamese-based trackers, namely SiamRPN++, SiamFC, and SiamBAN. Extensive experimental results on six recent datasets demonstrate that all three adjusted trackers achieve superior accuracy while maintaining a high running speed.
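The classification view described above can be illustrated with a minimal sketch. This is not the paper's implementation: it only assumes that the template feature from the first frame acts as the weights of a linear classifier, and that each candidate location in the search region is a sample scored by an inner product (the idea behind cross-correlation in Siamese trackers). The function name `classify_candidates` and the toy feature vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def classify_candidates(
    template: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Score every candidate sample against the template feature.

    The template plays the role of classifier weights (an assumption made
    for illustration); the highest-scoring candidate is taken as the target.
    Returns the index of the best candidate and the list of all scores.
    """
    scores = [dot(template, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example: three candidate samples with 3-dimensional features.
template = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # background-like sample
    [0.9, 0.1, 1.1],  # sample resembling the template
    [0.5, 0.5, 0.5],  # ambiguous sample
]
best, scores = classify_candidates(template, candidates)
print(best, scores)
```

Under this view, a tracking failure corresponds to a background sample scoring higher than the true target, which is the situation that sequence-specific adjustment from the first frame is meant to correct.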