Digital histopathology whole slide images (WSIs) are gigapixel-scale, high-resolution images that are highly valuable for disease diagnosis. However, digital histopathology image analysis faces significant challenges due to limited training labels, since manually annotating specific regions or small patches cropped from large WSIs requires substantial time and effort. Weakly supervised multiple instance learning (MIL) offers a practical and efficient alternative by requiring only bag-level (slide-level) labels, where each bag typically contains multiple instances (patches). Most MIL methods directly use frozen image patch features generated by various image encoders as inputs and focus primarily on feature aggregation. However, feature representation learning for encoder pretraining in MIL settings has largely been neglected. In our work, we propose a novel feature representation learning framework called weakly supervised contrastive learning (WeakSupCon) that incorporates bag-level label information during training. Our method does not rely on instance-level pseudo-labeling, yet it effectively separates patches with different labels in the feature space. Experimental results demonstrate that the image features generated by our WeakSupCon method lead to improved downstream MIL performance compared to self-supervised contrastive learning approaches on three datasets. Our related code is available at github.com/BzhangURU/Paper_WeakSupCon_for_MIL
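To make the MIL setting described above concrete, the following is a minimal sketch (not the WeakSupCon method itself) of how bags, instances, and bag-level labels relate: each slide becomes a bag of patch feature vectors, only the bag carries a label, and a simple aggregation (here, mean pooling, chosen purely for illustration) maps instance features to a single bag-level feature. All data and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: each "bag" (slide) is a set of instance (patch)
# feature vectors; labels exist only at the bag level (weak supervision).
bags = [rng.normal(size=(n, 8)) for n in (5, 12, 7)]  # 3 slides, variable patch counts
bag_labels = np.array([0, 1, 0])                      # slide-level labels only

def aggregate_mean(bag: np.ndarray) -> np.ndarray:
    """Simplest MIL aggregation: mean-pool instance features into one bag feature.
    Real MIL methods typically use learned (e.g. attention-based) aggregation."""
    return bag.mean(axis=0)

# One pooled feature vector per slide, paired with its slide-level label.
bag_features = np.stack([aggregate_mean(b) for b in bags])
print(bag_features.shape)  # (3, 8)
```

In this framing, the encoder that produces the patch features is usually pretrained and frozen; the abstract's point is that how that encoder is pretrained (here, using bag-level labels in a contrastive objective) matters for downstream MIL performance.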