Multi-spectral image stitching leverages the complementarity between infrared and visible images to generate a robust and reliable wide field-of-view (FOV) scene. The primary challenge of this task is to explore the relations between multi-spectral images in order to align and integrate multi-view scenes. Capitalizing on the strengths of Graph Convolutional Networks (GCNs) in modeling feature relationships, we propose a spatial graph reasoning-based multi-spectral image stitching method that effectively distills the deformation and integration of multi-spectral images across different viewpoints. To accomplish this, we embed multi-scale complementary features from the same view position into a set of nodes. The correspondence across different views is learned through dense feature embeddings, where inter- and intra-view correlations are built to exploit cross-view matching and to enhance feature disparity within each view. By introducing long-range coherence along the spatial and channel dimensions, the complementarity of pixel relations and channel interdependencies aids the reconstruction of aligned multi-view features, generating informative and reliable wide-FOV scenes. Moreover, we release a challenging dataset named ChaMS, comprising both real-world and synthetic sets with significant parallax, which provides a new option for comprehensive evaluation. Extensive experiments demonstrate that our method surpasses state-of-the-art methods.
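To make the reasoning over nodes more concrete, the following is a minimal NumPy sketch of attention-style graph reasoning along the spatial and channel dimensions. It is an illustration under our own assumptions, not the architecture proposed in the paper: the function names (`spatial_reasoning`, `channel_reasoning`, `reason_over_views`), the softmax-normalized dot-product adjacency, the scaling factors, and the residual sum are all hypothetical choices. Each view is assumed to be flattened into `N` nodes with `C` channels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_reasoning(nodes, context):
    """Aggregate context nodes into each query node.

    nodes:   (N, C) query nodes (e.g. flattened pixels of one view)
    context: (M, C) nodes to gather from (same view = intra, other view = inter)
    Returns an (N, C) array. The adjacency is an assumed dot-product affinity.
    """
    scale = np.sqrt(nodes.shape[1])
    adjacency = softmax(nodes @ context.T / scale, axis=-1)  # (N, M), rows sum to 1
    return adjacency @ context


def channel_reasoning(nodes):
    """Mix channels using an assumed channel-to-channel affinity of shape (C, C)."""
    scale = np.sqrt(nodes.shape[0])
    adjacency = softmax(nodes.T @ nodes / scale, axis=-1)  # (C, C), rows sum to 1
    return nodes @ adjacency.T


def reason_over_views(view_a, view_b):
    """Combine intra-view, inter-view, and channel reasoning for view_a.

    The residual sum below is an illustrative choice, not the paper's fusion rule.
    """
    intra = spatial_reasoning(view_a, view_a)  # long-range relations within one view
    inter = spatial_reasoning(view_a, view_b)  # cross-view matching
    channel = channel_reasoning(view_a)        # channel interdependencies
    return view_a + intra + inter + channel


# Toy usage: two views, each flattened to 16 nodes with 8 channels.
rng = np.random.default_rng(0)
view_a = rng.standard_normal((16, 8))
view_b = rng.standard_normal((16, 8))
fused = reason_over_views(view_a, view_b)
print(fused.shape)
```

In this sketch the same routine serves both the intra-view and the inter-view case; only the choice of context nodes differs, which is one simple way to express the two kinds of correlation mentioned in the abstract.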