RGBT tracking suffers from a range of challenging factors, including low resolution, similar appearance, extreme illumination, thermal crossover, and occlusion, to name a few. Existing works often study complex fusion models to handle challenging scenarios, but these models cannot adapt well to the full variety of challenges, which might limit tracking performance. To handle this problem, we propose a novel Dynamic Disentangled Fusion Network, called DDFNet, which uses challenge attributes to disentangle the fusion process into several dynamic fusion models, each adapted to a particular challenging scenario, for robust RGBT tracking. In particular, we design six attribute-based fusion models to integrate RGB and thermal features under six challenging scenarios, respectively. Since each fusion model deals only with its corresponding challenge, this disentangled fusion scheme could increase fusion capacity without depending on large-scale training data. Considering that every challenging scenario also has different levels of difficulty, we propose to form each attribute-based fusion model by dynamically optimizing the combination of multiple fusion units, so that it could adapt well to the difficulty of the corresponding scenario. To determine which fusion models should be activated during tracking, we design an adaptive aggregation fusion module that integrates the features from all attribute-based fusion models in an adaptive manner, using a three-stage training algorithm. In addition, we design an enhancement fusion module to further strengthen the aggregated feature and the modality-specific features. Experimental results on benchmark datasets demonstrate the effectiveness of our DDFNet against other state-of-the-art methods.
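The data flow described in the abstract can be illustrated with a minimal sketch. It is not the DDFNet implementation: the three fusion units (`unit_add`, `unit_max`, `unit_mean`), the use of softmax weights, and every function name here are assumptions made for illustration, and the logits that a trained network would predict or learn are passed in as plain numbers. The sketch shows only the two levels of weighting: each attribute-based fusion model mixes several fusion units, and an aggregation step mixes the outputs of the six attribute-based models.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(vectors, weights):
    """Element-wise weighted sum of equally sized feature vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]


# Hypothetical fusion units; the abstract does not say what the units compute.
def unit_add(rgb, tir):
    return [r + t for r, t in zip(rgb, tir)]


def unit_max(rgb, tir):
    return [max(r, t) for r, t in zip(rgb, tir)]


def unit_mean(rgb, tir):
    return [(r + t) / 2.0 for r, t in zip(rgb, tir)]


FUSION_UNITS = [unit_add, unit_max, unit_mean]


def attribute_fusion(rgb, tir, unit_logits):
    """One attribute-based fusion model: a weighted mix of fusion units.

    unit_logits stands in for whatever the real model uses to choose
    its combination of units (assumption).
    """
    outputs = [unit(rgb, tir) for unit in FUSION_UNITS]
    return weighted_sum(outputs, softmax(unit_logits))


def adaptive_aggregation(branch_features, gate_logits):
    """Mix the outputs of all attribute-based models with gate weights."""
    return weighted_sum(branch_features, softmax(gate_logits))


def fuse(rgb, tir, unit_logits_per_attribute, gate_logits):
    """Run every attribute branch, then aggregate the branch outputs."""
    branches = [
        attribute_fusion(rgb, tir, logits)
        for logits in unit_logits_per_attribute
    ]
    return adaptive_aggregation(branches, gate_logits)


if __name__ == "__main__":
    rgb_feature = [1.0, 2.0]
    thermal_feature = [3.0, 0.0]
    # Six attribute branches with uniform (all-zero) logits.
    unit_logits = [[0.0, 0.0, 0.0] for _ in range(6)]
    gate_logits = [0.0] * 6
    print(fuse(rgb_feature, thermal_feature, unit_logits, gate_logits))
```

Raising one entry of `gate_logits` shifts the aggregated feature toward that attribute branch, which is one simple way to picture activating a particular fusion model during tracking. The enhancement fusion module and the three-stage training algorithm are not modeled here.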