Multiple-instance learning (MIL) is widely used in computational pathology (CPath), where multi-scale features are essential for capturing both fine cellular details and broad tissue architecture. However, existing multi-scale MIL approaches typically rely on inflexible multi-magnification inputs or computationally expensive architectures. As pre-trained foundation models (FMs) become the prevailing choice for feature extraction and boost the performance of lightweight models, we revisit the problem and explore a more efficient multi-scale MIL method. In this paper, we propose the Multi-scale Pyramidal Network (MSPN), a plug-and-play module for attention-based MIL. MSPN introduces progressive multi-scale whole-slide image analysis using only a single high-magnification input. It consists of (1) grid-based remapping, which aggregates high-magnification features to derive spatially aware coarse feature maps, and (2) the Coarse Guidance Network (CGN), which learns coarse contexts. We benchmark MSPN as an add-on module to 4 attention-based frameworks on 5 clinically relevant tasks, using 2 foundation models and a pre-trained MIL framework. Our results demonstrate that MSPN consistently improves MIL across the compared configurations and tasks, while remaining lightweight and easy to use.
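To make the grid-based remapping step concrete, the following is a minimal sketch of one way high-magnification patch features could be aggregated into a coarser, spatially indexed feature map. It is an illustration under stated assumptions, not the MSPN implementation: the abstract does not specify the aggregation operator, so mean pooling is assumed here, and the function name `grid_remap`, the `factor` parameter, and the `(row, col)` coordinate convention are all hypothetical.

```python
import numpy as np

def grid_remap(features, coords, factor=2):
    """Average patch features that fall into the same coarse grid cell.

    features: (N, D) array of patch embeddings at high magnification.
    coords:   (N, 2) integer array of (row, col) patch grid indices.
    factor:   fine patches per coarse cell along each axis (assumed parameter).
    Returns (coarse_features, coarse_coords), one row per non-empty coarse cell.
    """
    features = np.asarray(features, dtype=np.float64)
    coords = np.asarray(coords, dtype=np.int64)

    # Map each fine patch to the coarse cell that contains it.
    coarse = coords // factor

    # Unique non-empty cells, plus the cell id assigned to each patch.
    cells, inverse = np.unique(coarse, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; shape differs across NumPy versions

    # Mean pooling (an assumption): sum features per cell, divide by patch count.
    sums = np.zeros((len(cells), features.shape[1]))
    np.add.at(sums, inverse, features)
    counts = np.bincount(inverse, minlength=len(cells))
    return sums / counts[:, None], cells

# Toy example: five patches with 2-dimensional features.
feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0]])
coords = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0, 2]])
coarse_feats, coarse_coords = grid_remap(feats, coords, factor=2)
```

In this toy example the first four patches share coarse cell (0, 0) and the fifth falls into cell (0, 1), so two coarse feature vectors are produced. Applying the same function again to its own output would yield a still coarser level, which is one possible reading of the "progressive" multi-scale analysis described above.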