As an emerging and promising class of generative models, diffusion models have been shown to outperform Generative Adversarial Networks (GANs) in multiple tasks, including image synthesis. In this work, we explore semantic image synthesis for abdominal CT using conditional diffusion models; the synthesized images can be used for downstream applications such as data augmentation. We systematically evaluated three diffusion models, compared them with state-of-the-art GAN-based approaches, and studied different conditioning scenarios for the semantic mask. Experimental results demonstrated that the diffusion models synthesized abdominal CT images of higher quality than the GAN-based approaches. Additionally, encoding the mask and the input separately is more effective than naive concatenation.
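To make the two conditioning scenarios concrete, the following is a minimal NumPy sketch, not the implementation evaluated in this work. It contrasts naive channel-wise concatenation of the semantic mask with the noisy input against a variant in which the mask and the input are encoded by separate branches. The function names, the 1x1 convolutions, and the scale-and-shift fusion are all illustrative assumptions; the abstract does not specify how the separately encoded features are combined.

```python
import numpy as np


def one_hot(mask, num_classes):
    """Convert an integer label map (H, W) to one-hot channels (num_classes, H, W)."""
    return (np.arange(num_classes)[:, None, None] == mask[None]).astype(np.float32)


def conv1x1(x, weight):
    """Apply a 1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def concat_conditioning(noisy_image, mask_onehot, weight):
    """Naive scheme: stack mask channels onto the noisy image and encode them jointly."""
    x = np.concatenate([noisy_image, mask_onehot], axis=0)
    return conv1x1(x, weight)


def separate_conditioning(noisy_image, mask_onehot, w_image, w_gamma, w_beta):
    """Separate scheme (assumed fusion): the image and the mask each get their own encoder.

    The mask branch predicts a per-pixel scale (gamma) and shift (beta) that
    modulate the image features. This fusion rule is a hypothetical choice
    made for illustration only.
    """
    image_features = conv1x1(noisy_image, w_image)
    gamma = conv1x1(mask_onehot, w_gamma)
    beta = conv1x1(mask_onehot, w_beta)
    return image_features * (1.0 + gamma) + beta


# Toy data: a single-channel 8x8 "CT slice" and a 4-class semantic mask.
rng = np.random.default_rng(0)
num_classes, features = 4, 16
mask = rng.integers(0, num_classes, size=(8, 8))
noisy_image = rng.standard_normal((1, 8, 8)).astype(np.float32)
mask_onehot = one_hot(mask, num_classes)

out_concat = concat_conditioning(
    noisy_image, mask_onehot, rng.standard_normal((features, 1 + num_classes))
)
out_separate = separate_conditioning(
    noisy_image,
    mask_onehot,
    rng.standard_normal((features, 1)),
    rng.standard_normal((features, num_classes)),
    rng.standard_normal((features, num_classes)),
)
```

Both variants produce feature maps of the same shape, so either could feed the same denoising backbone. Note that a single linear layer applied to concatenated channels is equivalent to summing two separate linear layers, which is why the separate variant here uses a multiplicative interaction rather than plain addition.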