In computational pathology, whole-slide image (WSI) classification is challenging because slides have gigapixel resolution and fine-grained annotations are scarce. Multiple-instance learning (MIL) offers a weakly supervised solution, yet refining instance-level information from bag-level labels remains difficult. Most conventional MIL methods use attention scores to estimate the instance importance scores (IIS) that drive slide-label prediction, but these scores often yield skewed attention distributions and misidentify crucial instances. To address these issues, we propose a new approach inspired by cooperative game theory: employing Shapley values to assess each instance's contribution, thereby improving IIS estimation. We then accelerate the Shapley value computation using attention while retaining the improved instance identification and prioritization. We further introduce a framework for the progressive assignment of pseudo bags based on the estimated IIS, encouraging more balanced attention distributions in MIL models. Extensive experiments on the CAMELYON-16, BRACS, TCGA-LUNG, and TCGA-BRCA datasets show that our method outperforms existing state-of-the-art approaches while offering enhanced interpretability and class-wise insights. Our source code is available at https://github.com/RenaoYan/PMIL.
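To make the idea of scoring instances by Shapley values concrete, the following is a minimal sketch of generic permutation-sampling Shapley estimation on a toy bag. It is not the attention-accelerated procedure described above and is not taken from the PMIL repository. The names `monte_carlo_shapley`, `toy_bag_value`, and `toy_scores` are hypothetical, and the max-pooled "bag value" is only an assumed stand-in for a real MIL model's slide-level prediction.

```python
import random
from typing import Callable, List, Sequence


def monte_carlo_shapley(
    n_instances: int,
    value_fn: Callable[[Sequence[int]], float],
    n_permutations: int = 200,
    seed: int = 0,
) -> List[float]:
    """Estimate each instance's Shapley value by sampling random orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_instances
    order = list(range(n_instances))
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition: List[int] = []
        prev = value_fn(coalition)
        for i in order:
            coalition.append(i)
            cur = value_fn(coalition)
            totals[i] += cur - prev  # marginal contribution of instance i
            prev = cur
    return [t / n_permutations for t in totals]


# Hypothetical per-instance evidence for a toy bag (illustrative values only).
toy_scores = [0.05, 0.90, 0.10, 0.70]


def toy_bag_value(coalition: Sequence[int]) -> float:
    """Assumed stand-in for a bag-level prediction: max-pooled evidence."""
    return max((toy_scores[i] for i in coalition), default=0.0)


shapley = monte_carlo_shapley(len(toy_scores), toy_bag_value)
# Rank instances from most to least important according to the estimate.
ranking = sorted(range(len(toy_scores)), key=lambda i: shapley[i], reverse=True)
```

Each sampled ordering requires one bag evaluation per instance, which becomes expensive for bags with thousands of patches; this cost is what motivates a faster approximation such as the attention-based acceleration mentioned above.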