Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective

Shengjia Chen, Gabriele Campanella, Abdulkadir Elmas, Aryeh Stock, Jennifer Zeng, Alexandros D. Polydorides, Adam J. Schoenfeld, Kuan-lin Huang, Jane Houldsworth, Chad Vanderbilt, Thomas J. Fuchs

From arXiv, 10 pages, 2 figures

Recent advances in artificial intelligence (AI), in particular self-supervised learning of foundation models (FMs), are revolutionizing medical imaging and computational pathology (CPath). A persistent challenge in the analysis of digital Whole Slide Images (WSIs) is aggregating tens of thousands of tile-level image embeddings into a single slide-level representation. Because method development relies heavily on datasets created for genomic research, such as TCGA, the performance of these techniques on diagnostic slides from clinical practice has been inadequately explored. This study conducts a thorough benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks, including diagnostic assessment, biomarker classification, and outcome prediction. The results yield the following key insights: (1) Embeddings derived from domain-specific FMs (trained on histological images) outperform those from generic ImageNet-based models across aggregation methods. (2) Spatially aware aggregators significantly improve performance when using ImageNet pre-trained models, but not when using FMs. (3) No single model excels in all tasks, and spatially aware models do not show the general superiority that might be expected. These findings underscore the need for more adaptable and universally applicable aggregation techniques, guiding future research towards tools that better meet the evolving needs of clinical AI in pathology. The code used in this work is available at https://github.com/fuchs-lab-public/CPath_SABenchmark.
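To make the aggregation problem concrete, here is a minimal sketch of collapsing a bag of tile-level embeddings into one slide-level vector. It is illustrative only: the abstract does not name the ten benchmarked methods, so mean pooling and a simple softmax-attention pooling are assumed here as generic, non-spatial baselines. The function names, the scoring vector `w`, and the toy sizes are all hypothetical, and in practice the scoring parameters would be learned rather than drawn at random.

```python
import math
import random


def mean_pool(tiles):
    """Average tile embeddings (equal-length vectors) into one slide vector."""
    n, dim = len(tiles), len(tiles[0])
    return [sum(t[j] for t in tiles) / n for j in range(dim)]


def attention_pool(tiles, w):
    """Softmax-weighted average of tiles; `w` is a hypothetical scoring vector."""
    scores = [sum(wj * tj for wj, tj in zip(w, t)) for t in tiles]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tiles[0])
    slide = [sum(a * t[j] for a, t in zip(weights, tiles)) for j in range(dim)]
    return slide, weights


rng = random.Random(0)
n_tiles, dim = 1000, 8  # toy sizes; real slides have tens of thousands of tiles
tiles = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tiles)]
w = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # stand-in for learned parameters

slide_mean = mean_pool(tiles)
slide_attn, weights = attention_pool(tiles, w)
```

Both functions ignore where each tile sits on the slide. A spatially aware aggregator, as discussed in the abstract, would additionally use tile coordinates or neighborhood structure, which this sketch does not attempt.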

