Cytology testing is an effective, non-invasive, convenient, and inexpensive tool for clinical cancer screening. Specimens prepared with ThinPrep, a commonly used liquid-based technique, can be scanned to generate digital whole slide images (WSIs) for cytology testing. However, classifying WSIs at gigapixel resolution is highly resource-intensive, which poses significant challenges for automated medical image analysis. To work around this computational bottleneck, existing methods focus on learning features at the cell or patch level, which typically requires detailed, labor-intensive manual annotation in the form of cell-level or patch-level labels. Here we propose a novel automated Label-Efficient WSI Screening method, dubbed LESS, for cytology-based diagnosis that requires only slide-level labels. First, to achieve label efficiency, we employ variational positive-unlabeled (VPU) learning, which uses WSI-level labels to improve patch-level feature learning. Second, motivated by the clinical practice of examining WSIs at varying fields of view and scales, we employ a cross-attention vision transformer (CrossViT) to fuse multi-scale patch-level features and perform WSI-level classification. We validate the proposed method on a urine cytology WSI dataset comprising 130 samples (13,000 patches) and on the FNAC 2019 dataset comprising 212 samples (21,200 patches). In terms of accuracy, AUC, sensitivity, and specificity, respectively, LESS reaches 84.79%, 85.43%, 91.79%, and 78.30% on the urine cytology WSI dataset, and 96.53%, 96.37%, 99.31%, and 94.95% on the FNAC 2019 dataset. It outperforms state-of-the-art methods and enables automated cytology-based bladder cancer screening.
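To make the four reported metrics concrete, the following minimal sketch computes accuracy, AUC, sensitivity, and specificity from slide-level labels and predicted scores. It is an illustration only and not the authors' evaluation code: the function name `screening_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here.

```python
from typing import Dict, Sequence


def screening_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, float]:
    """Compute slide-level screening metrics for binary labels (1 = positive)."""
    if len(y_true) != len(y_score):
        raise ValueError("y_true and y_score must have the same length")

    # Binarize the scores at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")

    # AUC as the probability that a random positive outscores a random
    # negative, counting ties as one half (Mann-Whitney formulation).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )

    return {
        "accuracy": (tp + tn) / len(y_true),
        "auc": wins / (len(pos) * len(neg)),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
    print(screening_metrics(labels, scores))
```

In a screening setting, sensitivity is the fraction of positive slides that are flagged and specificity is the fraction of negative slides that are correctly cleared. Unlike the other three metrics, AUC does not depend on the threshold.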