Image registration under domain shift remains a fundamental challenge in computer vision and medical imaging: when source and target images exhibit systematic intensity differences, the brightness constancy assumption underlying conventional registration methods is violated, rendering correspondence estimation ill-posed. We propose SAR-Net, a unified framework that addresses this challenge through principled scene-appearance disentanglement. Our key insight is that observed images can be decomposed into domain-invariant scene representations and domain-specific appearance codes, enabling registration via re-rendering rather than direct intensity matching. We establish theoretical conditions under which this decomposition enables consistent cross-domain alignment (Proposition 1) and prove that our scene consistency loss provides a sufficient condition for geometric correspondence in the shared latent space (Proposition 2). Empirically, we validate SAR-Net on the ANHIR (Automatic Non-rigid Histological Image Registration) challenge benchmark, where multi-stain histopathology images exhibit coupled domain shift from different staining protocols and geometric distortion from tissue preparation. Our method achieves a median relative Target Registration Error (rTRE) of 0.25%, outperforming the state-of-the-art MEVIS method (0.27% rTRE) by 7.4%, with a robustness of 99.1%. Code is available at https://github.com/D-ST-Sword/SAR-NET.