Cross-modal object tracking is an important research topic in information fusion. It aims to overcome imaging limitations in challenging scenarios by integrating switchable visible and near-infrared modalities. However, existing tracking methods have difficulty adapting to the significant target appearance variations caused by modality switches. For instance, tracking methods based on model updates struggle to maintain stable results during modality switching, which leads to error accumulation and model drift. Template-based tracking methods rely solely on template information from the first frame and/or the last frame, which lacks sufficient representational ability and makes significant target appearance changes hard to handle. To address this problem, we propose a prototype-based cross-modal object tracker called ProtoTrack, which introduces a novel prototype learning scheme to adapt to significant target appearance variations. In particular, we design a multi-modal prototype that represents target information with multiple kinds of samples, including a fixed sample from the first frame and two representative samples from different modalities. Moreover, we develop a prototype generation algorithm based on two new modules to ensure that the prototype remains representative under different challenges......
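To make the structure of the multi-modal prototype concrete, the following is a minimal Python sketch of a container holding one fixed first-frame sample and at most one representative sample per modality. All names here (`Sample`, `MultiModalPrototype`, `offer`, the `score` field, and the modality labels) are hypothetical and do not come from the abstract. The replacement rule, which keeps the highest-scoring sample per modality, is only a placeholder assumption; it is not the prototype generation algorithm of ProtoTrack, whose two modules the abstract does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical modality labels for the visible and near-infrared inputs.
MODALITIES = ("visible", "nir")


@dataclass
class Sample:
    """A target sample: a feature vector, its modality, and a confidence score."""
    features: List[float]
    modality: str
    score: float  # assumed tracker confidence; not specified in the abstract


@dataclass
class MultiModalPrototype:
    """One fixed first-frame sample plus one representative sample per modality."""
    first_frame: Sample
    representatives: Dict[str, Optional[Sample]] = field(
        default_factory=lambda: {m: None for m in MODALITIES}
    )

    def offer(self, candidate: Sample) -> bool:
        """Placeholder policy (an assumption): keep the highest-scoring
        sample seen so far in the candidate's modality slot."""
        if candidate.modality not in self.representatives:
            raise ValueError(f"unknown modality: {candidate.modality}")
        current = self.representatives[candidate.modality]
        if current is None or candidate.score > current.score:
            self.representatives[candidate.modality] = candidate
            return True
        return False

    def samples(self) -> List[Sample]:
        """Return the fixed sample followed by the filled modality slots."""
        reps = [s for s in self.representatives.values() if s is not None]
        return [self.first_frame] + reps


if __name__ == "__main__":
    init = Sample([1.0, 0.0], "visible", 1.0)
    proto = MultiModalPrototype(init)
    proto.offer(Sample([0.5, 0.5], "nir", 0.6))
    proto.offer(Sample([0.9, 0.1], "visible", 0.8))
    print(len(proto.samples()))
```

The first-frame sample is never replaced, which mirrors the "fixed sample" in the abstract, while each modality slot can change as tracking proceeds. A real tracker would substitute its own selection criterion for the score comparison.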