This paper addresses challenges in histopathological image analysis through three contributions. First, we introduce a fast patch selection method, FPS, for whole-slide image (WSI) analysis that significantly reduces computational cost while maintaining accuracy. Second, we present PathDino, a lightweight histopathology feature extractor with a minimal configuration of five Transformer blocks and only 9 million parameters, markedly fewer than alternatives. Third, we introduce a rotation-agnostic representation learning paradigm based on self-supervised learning that effectively mitigates overfitting. We also show that our compact model outperforms existing state-of-the-art histopathology-specific vision transformers on 12 diverse datasets, comprising internal datasets spanning four sites (breast, liver, skin, and colorectal) and seven public datasets (PANDA, CAMELYON16, BRACS, DigestPath, Kather, PanNuke, and WSSS4LUAD). Notably, even with a training dataset of 6 million histopathology patches from The Cancer Genome Atlas (TCGA), our approach demonstrates an average 8.5% improvement in patch-level majority vote performance. Together, these contributions provide a robust framework for image analysis in digital pathology, validated through extensive evaluation. Project Page: https://rhazeslab.github.io/PathDino-Page/