Object tracking is a difficult task in computer vision, especially under occlusion, changes in lighting, and motion blur. Recent advances in deep learning have shown promise in handling these conditions. However, most deep learning-based object trackers use only visible-band (RGB) images. Thermal infrared (TIR) images can provide additional information about an object, including its temperature, under such challenging conditions. We propose a deep learning-based object tracking approach that fuses RGB and thermal images (RGBT). The proposed model consists of two main components: a feature extractor and a tracker. The feature extractor encodes deep features from both the RGB and the TIR images. The tracker then uses these features to track the object with an enhanced attribute-based architecture. We propose fusing attribute-specific feature selection with an aggregation module. The proposed method is evaluated on the RGBT234 \cite{LiCLiang2018} and LasHeR \cite{LiLasher2021} datasets, which are the most widely used RGBT object-tracking datasets in the literature. The results show that the proposed system outperforms state-of-the-art RGBT object trackers on these datasets while using a relatively small number of parameters.