Histopathology image analysis plays a crucial role in cancer diagnosis. However, training a clinically applicable segmentation algorithm requires pathologists to engage in labor-intensive labeling. In contrast, weakly supervised learning methods, which require only coarse-grained labels at the image level, can significantly reduce the labeling effort. Unfortunately, while these methods perform reasonably well in slide-level prediction, their ability to locate cancerous regions, which is essential for many clinical applications, remains unsatisfactory. Previously, we proposed CAMEL, which achieves results comparable to those of fully supervised baselines in pixel-level segmentation. However, CAMEL requires 1,280x1,280 image-level binary annotations for positive whole-slide images (WSIs). Here, we present CAMEL2. By introducing a threshold on the cancerous ratio for positive bags, CAMEL2 makes better use of the available information, which enables us to scale up the image-level setting from 1,280x1,280 to 5,120x5,120 while maintaining accuracy. Our results on various datasets demonstrate that CAMEL2, using 5,120x5,120 image-level binary annotations, which are easy to produce, achieves performance comparable to that of a fully supervised baseline in both instance- and slide-level classification.
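To make the scale-up concrete, the following is a minimal sketch and not the CAMEL2 implementation. The arithmetic on image sizes follows directly from the numbers above: one 5,120x5,120 label covers the area of sixteen 1,280x1,280 labels. Everything else is an assumption made for illustration: the function names are hypothetical, the ratio value is arbitrary, and the idea that a positive bag label guarantees at least a given fraction of cancerous instances, with the highest-scoring instances taken as pseudo-positives, is only one plausible reading of "a threshold of the cancerous ratio for positive bags".

```python
import math


def tiles_per_image(image_side: int, tile_side: int) -> int:
    """Count non-overlapping tile_side x tile_side tiles in a square image."""
    per_side = image_side // tile_side
    return per_side * per_side


def min_positive_instances(num_instances: int, cancerous_ratio: float) -> int:
    """Lower bound on cancerous instances in a positive bag.

    ASSUMPTION: a positive label guarantees that at least `cancerous_ratio`
    of the bag's instances are cancerous. The source does not spell this out.
    """
    if not 0.0 < cancerous_ratio <= 1.0:
        raise ValueError("cancerous_ratio must be in (0, 1]")
    return max(1, math.ceil(cancerous_ratio * num_instances))


def select_pseudo_positives(scores: list[float], cancerous_ratio: float) -> list[int]:
    """Return indices of the top-k scoring instances, k set by the ratio.

    ASSUMPTION: top-k selection is a common multiple-instance learning
    heuristic, used here for illustration only.
    """
    k = min_positive_instances(len(scores), cancerous_ratio)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# One 5,120x5,120 label spans the area of this many 1,280x1,280 labels.
area_factor = tiles_per_image(5120, 1280)

# With a hypothetical ratio of 0.25, a positive bag of that many instances
# would be assumed to contain at least this many cancerous instances.
lower_bound = min_positive_instances(area_factor, 0.25)
```

Under these assumptions, a larger bag with only a "contains cancer" label would constrain a single instance, whereas a ratio threshold constrains several, which is one way the coarser label could still carry enough information.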