Matching visible and near-infrared (NIR) images remains a significant challenge in remote sensing image fusion. Nonlinear radiometric differences between heterogeneous remote sensing images make the matching task even more difficult. Deep learning has gained substantial attention in computer vision in recent years, but many methods rely on supervised learning and require large amounts of annotated data, which is frequently scarce in remote sensing image matching. To address this challenge, this paper proposes a novel keypoint descriptor approach that obtains robust feature descriptors via a self-supervised matching network. A lightweight transformer network, termed LTFormer, is designed to generate deep-level feature descriptors. Furthermore, we introduce a novel triplet loss function, LT Loss, to further enhance matching performance. Our approach outperforms conventional hand-crafted local feature descriptors and remains competitive with state-of-the-art deep learning-based methods, even when annotated data is scarce.
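For context, a minimal sketch of the conventional triplet objective that a loss such as LT Loss builds on: the network is pushed to place descriptors of matching visible/NIR keypoints close together and descriptors of non-matching keypoints far apart. The exact formulation of LT Loss is not specified in this excerpt, so the PyTorch code below shows only the standard triplet margin loss as an illustrative assumption; `margin` and the descriptor dimension are hypothetical choices.

```python
import torch
import torch.nn.functional as F

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss over L2-normalized descriptors.

    anchor/positive: descriptors of corresponding visible and NIR
    keypoints; negative: descriptor of a non-matching keypoint.
    NOTE: this is the generic triplet formulation, not the paper's
    LT Loss, whose exact definition is not given in this excerpt.
    """
    # Normalize so distances are comparable across descriptors.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negative = F.normalize(negative, dim=-1)
    # Euclidean distances to the matching and non-matching descriptors.
    d_pos = (anchor - positive).pow(2).sum(-1).sqrt()
    d_neg = (anchor - negative).pow(2).sum(-1).sqrt()
    # Penalize triplets where the positive is not closer by at least `margin`.
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage: a batch of 8 descriptors of dimension 128 (hypothetical sizes).
a, p, n = (torch.randn(8, 128, requires_grad=True) for _ in range(3))
loss = triplet_margin_loss(a, p, n)
loss.backward()  # in practice the descriptors come from the network
```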