Long-term tracking is an active topic in computer vision. Competitive models are presented every year, with steadily improving performance, mainly measured on standardized benchmarks such as Visual Object Tracking (VOT) and Object Tracking Benchmark (OTB). Over the last few years, the tracker-fusion strategy has been applied to overcome the known re-detection problem, and it has proven to be an important breakthrough. Following this approach, this work aims to generalize the fusion concept to an arbitrary number of baseline trackers in the pipeline, leveraging a learning phase to better understand how their outcomes correlate with each other, even when no target is present. The manuscript presents evidence for a model and data independence conjecture: the method yields a recall of 0.738 on the LTB-50 dataset when learning from VOT-LT2022, and 0.619 when the two datasets are swapped. In both cases, results are strongly competitive with the state of the art, and recall ranks first.