Multiple instance learning (MIL) has become a preferred method for classifying gigapixel whole slide images (WSIs) without requiring patch-level annotations. Current MIL research relies primarily on embedding-based approaches, which extract patch features with a pre-trained feature extractor and aggregate them for slide-level prediction. Despite the critical role of feature extraction, there is little guidance on selecting feature extractors to maximize WSI classification performance. This study addresses that gap by systematically evaluating MIL feature extractors along three dimensions: pre-training dataset, backbone model, and pre-training method. We conducted extensive experiments on two public WSI datasets (TCGA-NSCLC and Camelyon16) using four state-of-the-art (SOTA) MIL models. Our findings reveal that selecting a robust self-supervised learning (SSL) method has a greater impact on performance than relying solely on an in-domain pre-training dataset. Additionally, prioritizing Transformer-based backbones with deeper architectures over CNN-based models, and using larger, more diverse pre-training datasets, significantly improves classification outcomes. We believe these insights offer practical guidance for optimizing WSI classification and explain the performance advantages of current SOTA pathology foundation models. This work may also inform the development of more effective foundation models. Our code is publicly available at https://anonymous.4open.science/r/MIL-Feature-Extractor-Selection