Wireless capsule endoscopy (WCE) is a non-invasive method for visualizing the gastrointestinal (GI) tract and is crucial for diagnosing GI diseases. However, interpreting WCE results is time-consuming and tiring. Existing studies have employed deep neural networks (DNNs) for automatic GI tract lesion detection, but acquiring sufficient training examples remains a challenge, in part because of privacy concerns. Public WCE databases are limited in both diversity and size. To address this, we propose a novel approach that leverages generative models, specifically a diffusion model (DM), to generate diverse WCE images. Our model incorporates a semantic map produced by a visualization scale (VS) engine, which enhances the controllability and diversity of the generated images. We evaluate our approach through visual inspection and visual Turing tests, which demonstrate its effectiveness in generating realistic and diverse WCE images.