Whole Slide Image (WSI) classification remains challenging because WSIs have extremely high resolution and lack fine-grained labels. When only slide-level labels are available, WSI classification is usually formulated as a Multiple Instance Learning (MIL) problem. MIL methods consist of a patch embedding module and a bag-level classification module, but training the two end-to-end is prohibitively expensive. Existing methods therefore usually train them separately or skip training the embedder altogether. Such schemes deny the patch embedder access to slide-level semantic labels, resulting in inconsistency across the MIL pipeline. To overcome this issue, we propose a novel framework called Iteratively Coupled MIL (ICMIL), which bridges loss back-propagation from the bag-level classifier to the patch embedder. In ICMIL, the category information learned by the bag-level classifier guides the patch-level fine-tuning of the patch embedder. The refined embedder then generates better instance representations, which in turn yield a more accurate bag-level classifier. By coupling the patch embedder and the bag classifier at low cost, the framework enables information exchange between the two modules and benefits the entire MIL classification model. We tested the framework on two datasets using three different backbones, and the experimental results demonstrate consistent performance improvements over state-of-the-art MIL methods. The code is available at: https://github.com/Dootmaan/ICMIL.
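The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the released ICMIL code. It assumes a linear patch embedder, mean pooling, a logistic bag-level head, and synthetic 2-D "patches". It also makes a simplifying assumption that each patch inherits its slide's label while the embedder is fine-tuned through the frozen head. All function and variable names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def embed(W, patch):
    # Linear patch embedder: one output value per row of W.
    return [sum(w * x for w, x in zip(row, patch)) for row in W]


def head_logit(head, vec):
    # Logit of the logistic head for one (pooled or single-patch) embedding.
    return sum(h * v for h, v in zip(head["w"], vec)) + head["b"]


def bag_prob(W, head, bag):
    # Mean-pool the patch embeddings, then apply the bag-level head.
    embs = [embed(W, p) for p in bag]
    pooled = [sum(col) / len(embs) for col in zip(*embs)]
    return sigmoid(head_logit(head, pooled)), pooled


def train_head(W, head, bags, labels, lr=0.5, epochs=50):
    # Step 1: embedder frozen; the bag classifier learns from slide labels.
    for _ in range(epochs):
        for bag, y in zip(bags, labels):
            p, pooled = bag_prob(W, head, bag)
            g = p - y  # gradient of binary cross-entropy w.r.t. the logit
            head["w"] = [w - lr * g * v for w, v in zip(head["w"], pooled)]
            head["b"] -= lr * g


def finetune_embedder(W, head, bags, labels, lr=0.05, epochs=20):
    # Step 2: head frozen; the patch-level loss flows through it into W.
    # Assumption: every patch inherits its slide label (a simplification).
    for _ in range(epochs):
        for bag, y in zip(bags, labels):
            for patch in bag:
                g = sigmoid(head_logit(head, embed(W, patch))) - y
                for i, row in enumerate(W):
                    # d(logit)/dW[i][j] = head["w"][i] * patch[j]
                    W[i] = [w - lr * g * head["w"][i] * x
                            for w, x in zip(row, patch)]


def make_bag(rng, positive, n=8):
    # Synthetic bag: Gaussian patches; positive bags get 3 shifted patches.
    bag = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    if positive:
        for patch in bag[:3]:
            patch[0] += 3.0
    return bag


rng = random.Random(0)
labels = [i % 2 for i in range(20)]
bags = [make_bag(rng, y == 1) for y in labels]

W = [[1.0, 0.0], [0.0, 1.0]]          # embedder starts as the identity map
head = {"w": [0.0, 0.0], "b": 0.0}    # bag-level classifier starts at zero

for _ in range(3):                     # iterative coupling rounds
    train_head(W, head, bags, labels)
    finetune_embedder(W, head, bags, labels)

probs = [bag_prob(W, head, bag)[0] for bag in bags]
```

Because each step updates only one module while the other stays frozen, no gradient ever has to pass through all patches of a slide at once, which is what keeps the coupling cheap in this sketch.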