Accurate tumor detection in digital pathology whole-slide images (WSIs) is crucial for cancer diagnosis and treatment planning. Multiple Instance Learning (MIL) has emerged as a widely used approach for weakly-supervised tumor detection on large-scale data, without the need for manual annotations. However, traditional MIL methods typically frame detection as a classification task that requires tumor-free cases as negative examples, and such cases are challenging to obtain in real-world clinical workflows, especially for surgical resection specimens. We address this limitation by reformulating tumor detection as a regression task that estimates the tumor percentage of a WSI, a target that is clinically available across multiple cancer types. In this paper, we analyze the proposed weakly-supervised regression framework by applying it to multiple organs, specimen types, and clinical scenarios. We characterize the robustness of the framework to tumor percentage as a noisy regression target, and we introduce a novel amplification technique that improves tumor detection sensitivity when learning from small tumor regions. Finally, we provide interpretable insights into the model's predictions by analyzing visual attention maps and logit maps. Our code is available at https://github.com/DIAGNijmegen/tumor-percentage-mil-regression.
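To make the regression formulation concrete, the following is a minimal sketch of how a slide-level tumor percentage could be tied to patch-level predictions. It is an illustration under stated assumptions, not the authors' implementation: the linear patch scorer, the mean pooling of patch probabilities, the squared-error loss, and all names (`predict_tumor_percentage`, `regression_loss`, `w`, `b`) are hypothetical choices made here for clarity.

```python
import numpy as np


def sigmoid(x):
    """Map logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def predict_tumor_percentage(patch_features, w, b):
    """Hypothetical MIL regression head (assumption, not the paper's model).

    patch_features: (n_patches, n_dims) array, one embedding per WSI patch.
    Returns the slide-level tumor percentage in [0, 100] and the per-patch
    logits, which could be rendered as a logit map over the slide.
    """
    logits = patch_features @ w + b      # one tumor logit per patch
    probs = sigmoid(logits)              # per-patch tumor probability
    slide_pct = 100.0 * probs.mean()     # mean pooling -> tumor percentage
    return slide_pct, logits


def regression_loss(pred_pct, target_pct):
    """Squared error against the (possibly noisy) reported percentage."""
    return (pred_pct - target_pct) ** 2


# Toy usage with random data standing in for patch embeddings.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 16))
w = rng.normal(size=16)
b = 0.0
pct, logits = predict_tumor_percentage(features, w, b)
loss = regression_loss(pct, target_pct=30.0)
```

Only the slide-level percentage is supervised in this sketch, so no tumor-free negative slides and no patch-level annotations are needed; the patch logits are a by-product that can be inspected for localization.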