This article presents a general Bayesian learning framework for multi-modal groupwise image registration. The method builds on a probabilistic model of the image generative process, in which the underlying common anatomy and the geometric variations of the observed images are explicitly disentangled as latent variables, so that groupwise registration is achieved via hierarchical Bayesian inference. We propose a novel hierarchical variational auto-encoding architecture to realise this inference, in which the registration parameters are estimated explicitly and in a mathematically interpretable fashion. Notably, the new paradigm learns groupwise registration through an unsupervised, closed-loop self-reconstruction process, sparing the burden of designing complex image-based similarity measures. The computationally efficient, disentangled network architecture is also inherently scalable and flexible, allowing registration of large image groups of variable size. Furthermore, the structural representations inferred from multi-modal images via disentanglement learning capture the latent anatomy of the observations with visual semantics. Extensive experiments on four datasets of cardiac, brain, and abdominal medical images validate the proposed framework, and the results demonstrate its superiority over conventional similarity-based approaches in terms of accuracy, efficiency, scalability, and interpretability.
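The core idea of disentangling a latent common anatomy from per-image geometric variations can be illustrated with a deliberately simple toy, sketched below. This is not the paper's variational architecture: the deformation model is restricted to circular integer shifts, the "inference" is plain alternating minimisation, and the sum-of-squares criterion stands in for the learned reconstruction objective (the actual framework replaces such hand-crafted matching with hierarchical variational inference). The function names `make_group` and `groupwise_register` are hypothetical.

```python
import numpy as np

def make_group(template, shifts):
    """Observed images = one common anatomy deformed by per-image shifts."""
    return [np.roll(template, s) for s in shifts]

def groupwise_register(images, n_iters=5):
    """Toy closed-loop scheme: alternate between (1) reconstructing the
    latent common template as the mean of back-warped images and
    (2) re-estimating each image's geometric variation against it."""
    n = len(images)
    length = len(images[0])
    shifts = np.zeros(n, dtype=int)
    for _ in range(n_iters):
        # infer the common anatomy from the current shift estimates
        template = np.mean(
            [np.roll(img, -s) for img, s in zip(images, shifts)], axis=0)
        # update each geometric variation by best self-reconstruction
        for i, img in enumerate(images):
            errs = [np.sum((np.roll(img, -s) - template) ** 2)
                    for s in range(length)]
            shifts[i] = int(np.argmin(errs))
    return shifts, template

x = np.arange(64)
template = np.sin(2 * np.pi * x / 64)   # latent common anatomy
true_shifts = [0, 5, 13]                # latent geometric variations
images = make_group(template, true_shifts)
est, _ = groupwise_register(images)
# absolute shifts are identifiable only up to a common offset (gauge
# freedom), so we report shifts relative to the first image
print(((est - est[0]) % 64).tolist())
```

Note the gauge freedom in the usage example: any common offset added to all shifts leaves the reconstruction unchanged, so only relative transformations between group members are recoverable, mirroring the need to fix a reference frame in groupwise registration.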