Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks that underpin a wide range of applications. They are challenging, however, because nuclei vary in staining and size, have overlapping boundaries, and tend to cluster. While convolutional neural networks have been used extensively for these tasks, we explore the potential of Transformer-based networks in this domain. We introduce CellViT, a new method for automated instance segmentation of cell nuclei in digitized tissue samples that uses a deep learning architecture based on the Vision Transformer. CellViT is trained and evaluated on the PanNuke dataset, one of the most challenging nuclei instance segmentation datasets, which consists of nearly 200,000 nuclei annotated into 5 clinically important classes across 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT encoder pre-trained on 104 million histological image patches, achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an F1-detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT
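To make the panoptic quality figure quoted in the abstract more concrete, the following is a minimal sketch of how panoptic quality (PQ) is commonly computed for a single pair of instance maps, as the product of a detection quality (DQ) and a segmentation quality (SQ) term. It is not the CellViT evaluation code: the function name `panoptic_quality`, the label-map convention (integer IDs with 0 as background), and the toy arrays are assumptions made for illustration, and the sketch does not reproduce how the reported mean PQ is averaged or how the F1-detection score matches nuclei.

```python
import numpy as np


def panoptic_quality(true_map, pred_map, iou_threshold=0.5):
    """Return (DQ, SQ, PQ) for two integer instance maps (0 = background).

    Hypothetical helper for illustration. A ground-truth instance and a
    predicted instance count as a true positive when their IoU exceeds
    iou_threshold; for thresholds >= 0.5 such a match is unique.
    """
    true_ids = [i for i in np.unique(true_map) if i != 0]
    pred_ids = [i for i in np.unique(pred_map) if i != 0]

    tp = 0
    iou_sum = 0.0
    matched_pred = set()

    for t in true_ids:
        t_mask = true_map == t
        # Only predictions that overlap this ground-truth nucleus can match.
        for p in np.unique(pred_map[t_mask]):
            if p == 0 or p in matched_pred:
                continue
            p_mask = pred_map == p
            inter = np.logical_and(t_mask, p_mask).sum()
            union = np.logical_or(t_mask, p_mask).sum()
            iou = inter / union
            if iou > iou_threshold:
                tp += 1
                iou_sum += iou
                matched_pred.add(p)
                break

    fp = len(pred_ids) - tp  # predicted nuclei without a match
    fn = len(true_ids) - tp  # ground-truth nuclei without a match

    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return 0.0, 0.0, 0.0

    dq = tp / denom                       # F1-style detection quality
    sq = iou_sum / tp if tp > 0 else 0.0  # mean IoU over matched pairs
    return dq, sq, dq * sq


# Toy example: two ground-truth nuclei, three predictions.
true_map = np.zeros((4, 4), dtype=int)
true_map[0:2, 0:2] = 1
true_map[2:4, 2:4] = 2

pred_map = np.zeros((4, 4), dtype=int)
pred_map[0:2, 0:2] = 1   # perfect match with nucleus 1 (IoU = 1.0)
pred_map[2:4, 2:4] = 2
pred_map[2, 2] = 0       # nucleus 2 is missing one pixel (IoU = 0.75)
pred_map[0, 3] = 3       # spurious detection (false positive)

dq, sq, pq = panoptic_quality(true_map, pred_map)
print(f"DQ={dq:.3f} SQ={sq:.3f} PQ={pq:.3f}")
```

In the toy example there are two true positives and one false positive, so DQ is 2 / 2.5 = 0.8, SQ is (1.0 + 0.75) / 2 = 0.875, and PQ is approximately 0.7.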