Human-machine Interactive Tissue Prototype Learning for Label-efficient Histopathology Image Segmentation

Deep neural networks have greatly advanced histopathology image segmentation, but they usually require abundant annotated data. Because whole slide images are gigapixel in scale and pathologists carry a heavy daily workload, obtaining pixel-level labels for supervised learning is often infeasible in clinical practice. Weakly-supervised segmentation methods, which rely on less laborious image-level labels, have been explored as an alternative, but their performance is unsatisfactory due to the lack of dense supervision. Inspired by the recent success of self-supervised learning, we present a label-efficient pipeline for building a tissue prototype dictionary and propose to use the resulting prototypes to guide histopathology image segmentation. Specifically, an encoder is trained with self-supervised contrastive learning to project unlabeled histopathology image patches into a discriminative embedding space. The patches are clustered in this space, and pathologists identify the tissue prototypes by efficient visual examination of the clusters. The encoder is then used to map images into the embedding space and to generate pixel-level pseudo tissue masks by querying the tissue prototype dictionary. Finally, the pseudo masks are used to train a segmentation network with dense supervision for better performance. Experiments on two public datasets demonstrate that our human-machine interactive tissue prototype learning method achieves segmentation performance comparable to that of fully-supervised baselines with a lower annotation burden, and that it outperforms other weakly-supervised methods. Code will be made available upon publication.
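
The dictionary-query step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or how embeddings map to pixels, so the sketch assumes cosine similarity, one embedding per spatial location, and prototypes that a pathologist has already labeled with a tissue class. The names `l2_normalize` and `query_prototypes` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def query_prototypes(embeddings, prototypes, prototype_labels):
    """Assign each spatial location the tissue label of its most similar prototype.

    embeddings:       (H, W, D) array, one D-dim embedding per location (assumed layout)
    prototypes:       (K, D) array, the tissue prototype dictionary
    prototype_labels: (K,) integer array, tissue class assigned to each prototype
    Returns an (H, W) integer pseudo mask.
    """
    h, w, d = embeddings.shape
    flat = l2_normalize(embeddings.reshape(-1, d))   # (H*W, D)
    protos = l2_normalize(prototypes)                # (K, D)
    similarity = flat @ protos.T                     # (H*W, K) cosine similarities
    nearest = similarity.argmax(axis=1)              # index of the closest prototype
    return prototype_labels[nearest].reshape(h, w)


# Toy example: three prototypes, two of which share tissue class 0.
prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
prototype_labels = np.array([0, 1, 0])
embeddings = np.array([[[2.0, 0.1], [0.1, 3.0]],
                       [[-1.0, 0.2], [0.5, 0.6]]])
pseudo_mask = query_prototypes(embeddings, prototypes, prototype_labels)
```

Several prototypes may share one tissue class, which is why the labels are stored separately from the prototype vectors. A pseudo mask produced this way would then serve as the dense training target for the segmentation network.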
