Segmentation of COVID-19 lesions can help physicians diagnose and treat the disease more effectively. However, relevant studies are scarce because existing COVID-19 datasets lack detailed information and high-quality annotations. To address this problem, we propose C2FVL, a Coarse-to-Fine segmentation framework via Vision-Language alignment that fuses image information with text information describing the number of lesions and their specific locations. Introducing text information allows the network to achieve better predictions on challenging datasets. We conduct extensive experiments on two COVID-19 datasets, one of chest X-ray images and one of CT images, and the results demonstrate that the proposed method outperforms other state-of-the-art segmentation methods.