Whole-slide image (WSI) classification is challenging because 1) patches from a WSI lack annotations, and 2) WSIs exhibit irrelevant variability, e.g., from the staining protocol. Recently, Multiple-Instance Learning (MIL) has made significant progress, allowing for classification based on slide-level, rather than patch-level, annotations. However, existing MIL methods ignore the fact that all patches from normal slides are normal. Using this free annotation, we introduce a semi-supervision signal to de-bias the inter-slide variability and to capture the common factors of variation within normal patches. Because our method is orthogonal to the choice of MIL algorithm, we evaluate it on top of recently proposed MIL algorithms and also compare its performance with other semi-supervised approaches. We evaluate our method on two public WSI datasets, Camelyon-16 and TCGA lung cancer, and demonstrate that it significantly improves the predictive performance of existing MIL algorithms and outperforms other semi-supervised algorithms. We release our code at https://github.com/AITRICS/pathology_mil.
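The "free annotation" described in the abstract can be illustrated with a minimal sketch. It assumes binary labels (0 = normal, 1 = tumor) and the standard MIL assumption with max pooling, under which a slide is positive if at least one patch is positive. The function names `patch_labels_from_slide_label` and `slide_prediction` are hypothetical and are not taken from the released code; the sketch shows only where the patch-level labels come from, not the authors' semi-supervised objective.

```python
from typing import List, Optional


def patch_labels_from_slide_label(
    slide_label: int, num_patches: int
) -> List[Optional[int]]:
    """Derive patch-level labels from a slide-level label.

    Assumed encoding: 0 = normal, 1 = tumor, None = unknown.
    Every patch of a normal slide is normal, so its label is known for free.
    A tumor slide contains at least one tumor patch, but which ones is
    unknown, so its patches stay unlabeled.
    """
    if slide_label == 0:
        return [0] * num_patches
    return [None] * num_patches


def slide_prediction(patch_scores: List[float], threshold: float = 0.5) -> int:
    """Aggregate patch scores into a slide label with max pooling.

    This is the standard MIL assumption, used here only for illustration;
    it is not necessarily the aggregation used by any particular method.
    """
    return int(max(patch_scores) >= threshold)


# A normal slide yields fully labeled patches; a tumor slide yields none.
normal_patches = patch_labels_from_slide_label(0, num_patches=4)
tumor_patches = patch_labels_from_slide_label(1, num_patches=3)
```

In this sketch, only the patches of normal slides carry labels, which is what makes the setting semi-supervised at the patch level: the labeled set contains normal patches only, while patches from tumor slides remain unlabeled.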