Video-assisted transoral tracheal intubation (TI) requires an endoscope that helps the physician guide the tracheal tube into the glottis rather than the esophagus. The growing trend toward robot-assisted TI requires a medical robot to distinguish anatomical features as an experienced physician does, a skill that can be imitated with supervised deep-learning techniques. However, real datasets of oropharyngeal organs are often inaccessible because open-source data are limited and patient privacy must be protected. In this work, we propose a domain adaptive Sim-to-Real framework called IoU-Ranking Blend-ArtFlow (IRB-AF) for image segmentation of oropharyngeal organs. The framework combines an image blending strategy called IoU-Ranking Blend (IRB) with the style-transfer method ArtFlow. IRB alleviates the poor segmentation performance caused by large domain differences between datasets, while ArtFlow further reduces the remaining discrepancies between them. To deal with the limited availability of actual endoscopic images, a virtual oropharynx image dataset generated with the SOFA framework serves as the training data for semantic segmentation. We combined IRB-AF with state-of-the-art domain adaptive segmentation models. The results show that our approach further improves both segmentation accuracy and training stability.
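The abstract does not describe how IRB uses its ranking, so the following is only a minimal sketch of the standard intersection-over-union (IoU) metric that the name refers to, plus an illustrative ordering of samples by that score. The function names `iou` and `rank_by_iou` are hypothetical, and the ascending sort order is an assumption rather than the authors' procedure.

```python
import numpy as np


def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union of two binary masks with the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def rank_by_iou(preds, targets):
    """Return (indices sorted from lowest to highest IoU, list of IoU scores).

    Hypothetical helper: the sort direction is an assumption, not taken
    from the paper.
    """
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order, scores


if __name__ == "__main__":
    a = np.array([[1, 1], [0, 0]])
    b = np.array([[1, 0], [0, 0]])
    order, scores = rank_by_iou([b, a], [b, b])
    print(order, scores)
```

Ranking from lowest to highest IoU puts the samples on which a segmentation model agrees least with the labels first; whether IRB blends those samples or the best-scoring ones is not stated in the abstract.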