Recent advances in computational pathology and artificial intelligence have significantly improved whole slide image (WSI) classification. However, the gigapixel resolution of WSIs and the scarcity of manual annotations present substantial challenges. Multiple instance learning (MIL) is a promising weakly supervised learning approach for WSI classification. Recent research has shown that pseudo-bag augmentation can expose models to more diverse data, thereby improving their performance. However, letting pseudo-bags directly inherit their parent bags' labels can introduce noise, because some pseudo-bags end up mislabeled during training. To address this issue, we recast the WSI classification task from weakly supervised learning to semi-weakly supervised learning, termed SWS-MIL, in which adaptive pseudo-bag augmentation (AdaPse) partitions the data into labeled and unlabeled sets based on a threshold strategy. Using the "student-teacher" pattern, we introduce a feature augmentation technique, MergeUp, which merges bags with low-priority bags to enhance inter-category information and increase training data diversity. Experimental results on the CAMELYON-16, BRACS, and TCGA-LUNG datasets show that our method outperforms existing state-of-the-art approaches, confirming its efficacy in WSI classification.