Classification of gigapixel Whole Slide Images (WSIs) is an important prediction task in the emerging area of computational pathology. Research on deep learning models for WSI classification has surged, with clinical applications such as cancer detection and prediction of molecular mutations from WSIs. Most methods require expensive and labor-intensive manual annotations by expert pathologists. Weakly supervised Multiple Instance Learning (MIL) methods have recently demonstrated excellent performance; however, they still require large training datasets with slide-level labels, and each label requires careful inspection of the slide by an expert pathologist. In this work, we propose a fully unsupervised WSI classification algorithm based on mutual transformer learning. Instances from gigapixel WSIs (i.e., image patches) are transformed into a latent space and then inverse-transformed to the original space. Pseudo-labels are generated from the transformation loss and then cleaned by a transformer label-cleaner. The proposed transformer-based pseudo-label generation and cleaning modules train each other iteratively in an unsupervised manner. A discriminative learning mechanism is introduced to improve normal versus cancerous instance labeling. In addition to unsupervised classification, we demonstrate the effectiveness of the proposed framework under weak supervision for cancer subtype classification as a downstream analysis. Extensive experiments on four publicly available datasets show excellent performance compared with state-of-the-art methods. We intend to make the source code of our algorithm publicly available soon.
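The pseudo-label generation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: a fixed linear projection stands in for the transformer, and the names `encode`, `decode`, `transformation_loss`, `pseudo_labels`, the `basis` matrix, and the `threshold` value are all hypothetical. It also assumes that a high transformation loss marks a patch as cancerous, which the abstract does not state explicitly.

```python
from typing import List, Sequence

Vector = Sequence[float]


def encode(x: Vector, basis: Sequence[Vector]) -> List[float]:
    """Project a patch feature vector into the latent space.

    Assumption: the rows of `basis` are orthonormal; a linear map stands in
    for the transformer encoder.
    """
    return [sum(b_i * x_i for b_i, x_i in zip(b, x)) for b in basis]


def decode(z: Sequence[float], basis: Sequence[Vector]) -> List[float]:
    """Inverse-transform a latent code back to the original feature space."""
    dim = len(basis[0])
    return [sum(z_k * basis[k][i] for k, z_k in enumerate(z)) for i in range(dim)]


def transformation_loss(x: Vector, basis: Sequence[Vector]) -> float:
    """Mean squared error between a patch and its round-trip reconstruction."""
    x_hat = decode(encode(x, basis), basis)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def pseudo_labels(
    patches: Sequence[Vector], basis: Sequence[Vector], threshold: float
) -> List[int]:
    """Label a patch 1 when its loss exceeds the threshold, else 0.

    Assumption: 1 means "suspected cancerous", 0 means "normal".
    """
    return [1 if transformation_loss(p, basis) > threshold else 0 for p in patches]


if __name__ == "__main__":
    # Toy 3-D patch features; the latent space keeps only the first two axes.
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    patches = [[1.0, 2.0, 0.0], [0.5, 0.5, 3.0]]
    print(pseudo_labels(patches, basis, threshold=1.0))
```

In the proposed method these noisy pseudo-labels would then be passed to the label-cleaner, and the two modules would be retrained in alternation; that loop is omitted here because the abstract gives no detail on it.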