Creating large-scale, well-annotated datasets to train AI algorithms is crucial for automated tumor detection and localization. With limited resources, however, it is difficult to determine which type of annotation is best when annotating massive amounts of unlabeled data. To address this issue, we focus on polyps in colonoscopy videos and pancreatic tumors in abdominal CT scans; both applications require significant time and effort for pixel-wise annotation because the data are high-dimensional, involving either temporal or spatial dimensions. In this paper, we develop a new annotation strategy, termed Drag&Drop, which simplifies the annotation process to a drag-and-drop action. This annotation strategy is more efficient, particularly for temporal and volumetric imaging, than other annotation types, such as per-pixel masks, bounding boxes, scribbles, ellipses, and points. Furthermore, to exploit our Drag&Drop annotations, we develop a novel weakly supervised learning method based on the watershed algorithm. Experimental results show that our method achieves better detection and localization performance than alternative weak annotations and, more importantly, performs similarly to models trained on detailed per-pixel annotations. Interestingly, we find that, with limited resources, allocating weak annotations across a diverse patient population yields models that are more robust to unseen images than allocating per-pixel annotations to a small set of images. In summary, this research proposes an efficient annotation strategy for tumor detection and localization that is less accurate than per-pixel annotations but useful for creating large-scale datasets for tumor screening across various medical modalities.
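As a rough illustration of how a drag-and-drop annotation could seed a watershed-based pseudo-label, the sketch below floods a 2D image from two sets of markers. It is not the method developed in this paper, whose details are not given here. The function names, the circular marker layout (one foreground seed at the drop point, background seeds outside the dragged radius), and the simple neighbour-difference elevation map are all assumptions made for illustration.

```python
import heapq

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


def elevation_map(image):
    """Hypothetical edge map: max absolute difference to the 4-neighbours."""
    h, w = len(image), len(image[0])
    elev = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w:
                    elev[r][c] = max(elev[r][c], abs(image[r][c] - image[rr][cc]))
    return elev


def drag_drop_markers(shape, center, radius):
    """Assumed marker layout: label 1 at the drop point, label 2 outside the
    dragged circle, 0 (unknown) everywhere else."""
    h, w = shape
    cr, cc = center
    markers = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if (r - cr) ** 2 + (c - cc) ** 2 > radius ** 2:
                markers[r][c] = 2
    markers[cr][cc] = 1
    return markers


def watershed_from_markers(elev, markers):
    """Simplified priority-flood watershed: labelled pixels are visited in
    order of increasing elevation and pass their label to unlabelled
    4-neighbours. No separate watershed-line label is produced."""
    h, w = len(elev), len(elev[0])
    labels = [row[:] for row in markers]
    heap, order = [], 0  # `order` breaks ties by insertion time
    for r in range(h):
        for c in range(w):
            if labels[r][c]:
                heapq.heappush(heap, (elev[r][c], order, r, c))
                order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and labels[rr][cc] == 0:
                labels[rr][cc] = labels[r][c]
                heapq.heappush(heap, (elev[rr][cc], order, rr, cc))
                order += 1
    return labels


# Toy example: a bright 3x3 square on a dark 9x9 background.
image = [[1 if 3 <= r <= 5 and 3 <= c <= 5 else 0 for c in range(9)]
         for r in range(9)]
markers = drag_drop_markers((9, 9), center=(4, 4), radius=3)
pseudo_mask = watershed_from_markers(elevation_map(image), markers)
```

In this toy case the foreground label stays inside the bright square, while pixels on the square's boundary can go to either label depending on tie-breaking. A practical pipeline would more likely rely on a tested library implementation and operate on video frames or CT volumes rather than nested lists.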