We present REMM, a rotation-equivariant framework for end-to-end multimodal image matching that encodes the rotational differences of descriptors throughout the matching pipeline. Previous learning-based methods mainly focus on extracting modality-invariant descriptors while consistently overlooking rotational invariance. REMM consists of two components: a multimodal feature learning module and a cyclic shift module. We first learn modality-invariant features through the multimodal feature learning module. We then design the cyclic shift module to rotationally encode the descriptors, which greatly improves rotation-equivariant matching performance and makes the descriptors robust to arbitrary rotation angles. To validate our method, we establish a comprehensive rotation and scale matching benchmark for evaluating the rotation robustness of multimodal image matching; it combines multi-angle and multi-scale transformations applied to four publicly available datasets. Extensive experiments show that our method outperforms existing methods on this benchmark and generalizes well to independent datasets. Additionally, we conduct an in-depth analysis of the key components of REMM to validate the improvements brought by the cyclic shift module. Code and dataset are available at https://github.com/HanNieWHU/REMM.
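To make the idea of rotationally encoding descriptors by cyclic shifting concrete, the following is a minimal sketch and not the REMM implementation. It assumes a descriptor laid out as K orientation bins by C channels, so that rotating the underlying patch by a multiple of 360/K degrees corresponds to a cyclic shift along the orientation axis. The function names `cyclic_shift_bank` and `rotation_aware_distance`, the descriptor layout, and the use of an L2 distance are all illustrative assumptions.

```python
import numpy as np


def cyclic_shift_bank(desc: np.ndarray) -> np.ndarray:
    """Return all K cyclic shifts of a (K, C) descriptor as a (K, K, C) array.

    Hypothetical helper: entry s is the descriptor shifted by s orientation bins.
    """
    k = desc.shape[0]
    return np.stack([np.roll(desc, s, axis=0) for s in range(k)])


def rotation_aware_distance(desc_a: np.ndarray, desc_b: np.ndarray):
    """Compare desc_a against every cyclic shift of desc_b.

    Returns the smallest L2 distance and the shift (in bins) that achieves it.
    """
    bank = cyclic_shift_bank(desc_b)                      # (K, K, C)
    diffs = (bank - desc_a[None]).reshape(bank.shape[0], -1)
    dists = np.linalg.norm(diffs, axis=1)                 # one distance per shift
    best = int(np.argmin(dists))
    return float(dists[best]), best


# Toy check: desc_b is desc_a shifted by 3 of K = 8 orientation bins.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(8, 16))
desc_b = np.roll(desc_a, 3, axis=0)
dist, shift = rotation_aware_distance(desc_a, desc_b)
```

In this toy setting the search recovers the shift that undoes the simulated rotation (5 bins, since 3 + 5 wraps around K = 8) with zero distance. A learned module would presumably operate on feature maps rather than on an exhaustive shift search, so this sketch only illustrates why a cyclic shift is a natural encoding of rotation.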