Accurate tracking of transparent objects, such as glasses, plays a critical role in many robotic tasks, including robot-assisted living. Because the texture of such objects is adaptive and often reflective, traditional tracking algorithms that rely on general-purpose learned features suffer from reduced performance. Recent research has proposed instilling transparency awareness into existing general object trackers by fusing in purpose-built features. However, with existing fusion techniques, adding new features changes the latent space, which makes it impossible to incorporate transparency awareness into trackers with fixed latent spaces. For example, many present-day transformer-based trackers are fully pre-trained and are sensitive to any latent-space perturbation. In this paper, we present a new feature fusion technique that integrates transparency information into a fixed feature space, enabling its use in a broader range of trackers. Our proposed fusion module, composed of a transformer encoder and an MLP module, leverages key-query-based transformations to embed the transparency information into the tracking pipeline. We also present a new two-step training strategy that lets the fusion module merge transparency features effectively. We propose a new tracker architecture that uses our fusion technique to achieve superior results for transparent object tracking. Our method achieves results competitive with state-of-the-art trackers on TOTB, the largest recently released transparent object tracking benchmark. Our results and code implementation will be made publicly available at https://github.com/kalyan0510/TOTEM.
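To make the idea of fusing into a fixed feature space concrete, the following is a minimal, dependency-free sketch and not the authors' module. It swaps the transformer encoder described in the abstract for a single key-query attention step followed by a small MLP, with residual connections so that the fused tokens keep the backbone's width. The class name `ToyFusion`, all dimensions, and the random weights are assumptions made purely for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ToyFusion:
    """Hypothetical fusion block: backbone tokens query transparency tokens.

    The output has the same width as the backbone tokens, so a downstream
    (pre-trained) tracker head would see an unchanged feature shape.
    """

    def __init__(self, d_backbone, d_transp, d_hidden, seed=0):
        rng = random.Random(seed)
        self.Wq = random_matrix(d_hidden, d_backbone, rng)   # queries from backbone
        self.Wk = random_matrix(d_hidden, d_transp, rng)     # keys from transparency
        self.Wv = random_matrix(d_backbone, d_transp, rng)   # values mapped to backbone width
        self.W1 = random_matrix(d_hidden, d_backbone, rng)   # MLP layer 1
        self.W2 = random_matrix(d_backbone, d_hidden, rng)   # MLP layer 2
        self.scale = math.sqrt(d_hidden)

    def __call__(self, backbone_tokens, transp_tokens):
        keys = [matvec(self.Wk, t) for t in transp_tokens]
        values = [matvec(self.Wv, t) for t in transp_tokens]
        fused = []
        for b in backbone_tokens:
            q = matvec(self.Wq, b)
            scores = [sum(qi * ki for qi, ki in zip(q, k)) / self.scale for k in keys]
            weights = softmax(scores)
            attended = [sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(b))]
            h = [bi + ai for bi, ai in zip(b, attended)]      # residual: stay near input space
            hidden = [max(0.0, z) for z in matvec(self.W1, h)]  # ReLU
            mlp_out = matvec(self.W2, hidden)
            fused.append([hi + mi for hi, mi in zip(h, mlp_out)])
        return fused


# Usage with made-up sizes: 5 tokens, backbone width 8, transparency width 6.
rng = random.Random(1)
backbone = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
transp = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(5)]
fused = ToyFusion(d_backbone=8, d_transp=6, d_hidden=16)(backbone, transp)
print(len(fused), len(fused[0]))
```

The point of the sketch is only the shape contract: whatever the width of the transparency features, the fused tokens have the backbone's width, which is the property a tracker with a fixed latent space would need. It says nothing about the two-step training strategy, which the abstract does not detail.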