Surgical scene segmentation is essential for enhancing surgical precision, yet it is frequently compromised by the scarcity and imbalance of available data. To address these challenges, semantic image synthesis methods based on generative adversarial networks and diffusion models have been developed. However, these models often yield non-diverse images and fail to capture small, critical tissue classes, limiting their effectiveness. In response, we propose the Class-Aware Semantic Diffusion Model (CASDM), a novel approach that uses segmentation maps as conditions for image synthesis to tackle data scarcity and imbalance. We define novel class-aware mean squared error and class-aware self-perceptual loss functions that prioritize critical, less visible classes, thereby enhancing image quality and relevance. Furthermore, to our knowledge, we are the first to generate multi-class segmentation maps from text prompts that specify their contents. These maps are then used by CASDM to generate surgical scene images, enriching datasets for training and validating segmentation models. Our evaluation, which assesses both image quality and downstream segmentation performance, demonstrates the strong effectiveness and generalisability of CASDM in producing realistic image-map pairs, significantly advancing surgical scene segmentation across diverse and challenging datasets.
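The class-aware loss idea above can be sketched as a per-pixel weighted MSE, where each pixel's error is scaled by a weight attached to its semantic class. This is a minimal illustration, not the paper's exact formulation: the weighting scheme shown (inverse pixel frequency per class, supplied by the caller) is an assumption for demonstration purposes.

```python
import numpy as np

def class_aware_mse(pred, target, seg_map, class_weights):
    """Weighted MSE where each pixel's squared error is scaled by the
    weight of its semantic class, up-weighting small or rare classes.

    pred, target:  float arrays of shape (H, W) or (H, W, C)
    seg_map:       int array of shape (H, W) holding class ids
    class_weights: dict mapping class id -> float weight
                   (hypothetical scheme: e.g. inverse pixel frequency)
    """
    # Look up each pixel's class weight from the dict.
    w = np.vectorize(class_weights.get)(seg_map).astype(float)
    if pred.ndim == 3:
        w = w[..., None]  # broadcast weights across channels
    return float(np.mean(w * (pred - target) ** 2))

# Toy example: class 0 is rare, so it receives a larger weight.
pred = np.array([[1.0, 0.0], [0.0, 1.0]])
target = np.zeros((2, 2))
seg_map = np.array([[0, 1], [1, 1]])
loss = class_aware_mse(pred, target, seg_map, {0: 3.0, 1: 0.5})
```

With uniform weights this reduces to the ordinary MSE; the weighting simply shifts the optimisation pressure toward the classes the map says matter most.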