Efficient visual trackers overfit to their training distributions and generalize poorly: they perform well on their respective in-distribution (ID) test sets but worse on out-of-distribution (OOD) sequences, which limits their deployment in the wild under constrained resources. We introduce SiamABC, a highly efficient Siamese tracker that significantly improves tracking performance, even on OOD sequences. SiamABC relies on new architectural designs that bridge the dynamic variability of the target, and on new training losses. It also addresses OOD tracking generalization directly through a fast, backward-free dynamic test-time adaptation method that continuously adapts the model to the dynamic visual changes of the target. Our extensive experiments suggest that SiamABC achieves substantial performance gains on OOD sets while maintaining accurate performance on the ID benchmarks. SiamABC outperforms MixFormerV2-S by 7.6\% on the OOD AVisT benchmark while being 3x faster (100 FPS) on a CPU.