Integrating multi-modal data to improve medical image analysis has recently attracted considerable attention. This paper presents a novel scheme that learns the mutual benefits of different modalities to achieve better segmentation of unpaired multi-modal medical images. Our approach tackles two critical issues of this task from a practical perspective: (1) how to effectively learn the semantic consistencies between modalities (e.g., CT and MRI), and (2) how to leverage these consistencies to regularize network learning while keeping the network simple. To address (1), we use a carefully designed External Attention Module (EAM) to align the semantic class representations of different modalities and the correlations among those representations. To address (2), the EAM is designed as an external plug-and-play module that can be discarded once the model is optimized. We demonstrate the effectiveness of the proposed method in two medical image segmentation scenarios: (1) cardiac structure segmentation, and (2) abdominal multi-organ segmentation. Extensive experiments show that the proposed method outperforms competing methods by a wide margin.
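To make the idea of aligning class representations and their correlations across unpaired modalities more concrete, the following is a minimal sketch and not the authors' EAM. The abstract does not specify the attention mechanism, so this stand-in uses no attention at all: it assumes per-pixel features and labels are available for each modality, averages features per class into prototypes, and penalizes differences between the two modalities' prototypes and between their class-to-class cosine similarity matrices. The names `class_prototypes`, `cosine_correlation`, and `consistency_loss` are hypothetical. Because the comparison happens at the class level, the two inputs do not need to come from paired images.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Average the feature vectors of each class (hypothetical helper).

    features: (N, C) array of per-pixel features
    labels:   (N,) integer class labels
    returns:  (num_classes, C) array; rows of absent classes stay zero
    """
    protos = np.zeros((num_classes, features.shape[1]), dtype=np.float64)
    for k in range(num_classes):
        mask = labels == k
        if mask.any():
            protos[k] = features[mask].mean(axis=0)
    return protos


def cosine_correlation(protos, eps=1e-8):
    """Class-to-class cosine similarity matrix of the prototypes."""
    norms = np.linalg.norm(protos, axis=1, keepdims=True)
    unit = protos / np.maximum(norms, eps)  # zero rows remain zero
    return unit @ unit.T


def consistency_loss(feat_a, lab_a, feat_b, lab_b, num_classes):
    """Assumed regularizer: prototype mismatch plus correlation mismatch.

    This is an illustrative stand-in for a cross-modality consistency
    term, not the loss used in the paper.
    """
    pa = class_prototypes(feat_a, lab_a, num_classes)
    pb = class_prototypes(feat_b, lab_b, num_classes)
    proto_term = np.mean((pa - pb) ** 2)
    corr_term = np.mean((cosine_correlation(pa) - cosine_correlation(pb)) ** 2)
    return proto_term + corr_term
```

A term like this is only needed during training, which is consistent with the abstract's point that the external module can be dropped once the model is optimized: at inference time the segmentation network runs on its own.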