The performance of modern trackers degrades substantially on transparent objects compared to opaque ones, largely for two reasons. First, the appearance of a transparent object is directly affected by the background. Second, scenes with transparent objects often contain many visually similar objects (distractors), which frequently cause tracking failure. Moreover, developing modern tracking architectures requires large training sets, which do not yet exist for transparent object tracking. We present two contributions that address these issues. First, we propose Trans2k, the first training dataset for transparent object tracking, which consists of over 2k sequences with 104,343 images overall, annotated with bounding boxes and segmentation masks. Standard trackers trained on this dataset consistently improve, by up to 16%. Second, we propose a new distractor-aware transparent object tracker (DiTra) that treats localization accuracy and target identification as separate tasks and implements them in a novel architecture. DiTra sets a new state of the art in transparent object tracking and generalizes well to opaque objects.