Registering large-scale, heterogeneous SAR and optical images is highly challenging, particularly across platforms, because of significant geometric, radiometric, and temporal differences that most existing methods struggle to handle. To overcome these challenges, we propose Grid-Reg, a grid-based multimodal registration framework that pairs a domain-robust descriptor extraction network, the Hybrid Siamese Correlation Metric Learning Network (HSCMLNet), with a grid-based solver (Grid-Solver) for transformation parameter estimation. In heterogeneous imagery with large modality gaps and geometric differences, obtaining accurate correspondences is inherently difficult. To measure similarity between gridded patches robustly, HSCMLNet integrates a hybrid Siamese module with a correlation metric learning module (CMLModule) based on equiangular unit basis vectors (EUBVs), together with a manifold consistency loss that promotes modality-invariant, discriminative feature learning. The Grid-Solver estimates transformation parameters by minimizing a global grid matching loss with a progressive dual-loop search strategy, which reliably finds patch correspondences across entire images. Furthermore, we curate a challenging benchmark dataset for SAR-to-optical registration from real-world UAV MiniSAR data and Google Earth optical imagery. Extensive experiments demonstrate that the proposed approach outperforms state-of-the-art methods.
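The abstract names equiangular unit basis vectors (EUBVs) without giving their construction. As a rough illustration only, the sketch below builds one standard set of unit vectors with equal pairwise angles, the regular simplex, and scores a feature vector against them by cosine similarity. Whether Grid-Reg uses this construction is an assumption, not something the abstract states, and the names `simplex_equiangular_vectors` and `correlation_scores` are hypothetical.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def simplex_equiangular_vectors(n):
    """Return n unit vectors in R^n whose pairwise cosine is -1/(n-1).

    Assumption: this regular-simplex construction stands in for the
    EUBVs of the abstract; the paper's own construction may differ.
    """
    if n < 2:
        raise ValueError("need at least two vectors")
    scale = math.sqrt(n / (n - 1))
    # Row i of (I - J/n), where J is the all-ones matrix, rescaled to unit length.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / n) for j in range(n)]
        for i in range(n)
    ]


def correlation_scores(feature, basis):
    """Cosine similarity between one feature vector and each unit basis vector."""
    norm = math.sqrt(dot(feature, feature))
    if norm == 0.0:
        raise ValueError("feature vector must be non-zero")
    # Basis vectors already have unit length, so only the feature is normalized.
    return [dot(feature, b) / norm for b in basis]
```

With `n = 4`, for example, every pair of distinct vectors has cosine -1/3, so a descriptor aligned with one basis vector is equally dissimilar to all the others.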