Representation learning for pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). This approach yields slide representations that are highly tailored to a specific clinical task. Self-supervised learning (SSL) has been successfully applied to train histopathology foundation models (FMs) for patch embedding generation. However, generating patient- or slide-level embeddings remains challenging. Existing approaches to slide representation learning extend the principles of SSL from the patch level to entire slides, either by aligning different augmentations of the slide or by leveraging multimodal data. We propose a new single-modality SSL method that operates in feature space, integrating tile embeddings from multiple FMs to generate useful slide representations. Our contrastive pretraining strategy, called COBRA, employs multiple FMs and an architecture based on Mamba-2. COBRA outperforms state-of-the-art slide encoders on four public Clinical Proteomic Tumor Analysis Consortium (CPTAC) cohorts by at least +4.5% AUC on average, despite being pretrained on only 3048 WSIs from The Cancer Genome Atlas (TCGA). Additionally, COBRA is readily compatible at inference time with previously unseen feature extractors. Code available at https://github.com/KatherLab/COBRA.
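The abstract describes contrastive pretraining in feature space over tile embeddings produced by multiple foundation models. As a minimal illustrative sketch, the snippet below implements a generic InfoNCE-style contrastive objective between two slide-level embedding views (e.g., slide representations derived from two different FMs). This is an assumption for illustration only; the actual COBRA loss, its Mamba-2-based aggregator, and all function names here are not specified by the abstract.

```python
import numpy as np

def info_nce(z_a: np.ndarray, z_b: np.ndarray, temperature: float = 0.1) -> float:
    """Generic InfoNCE loss between two batches of paired embeddings.

    z_a, z_b: (batch, dim) arrays where row i of z_a and row i of z_b
    are two "views" of the same slide (here: hypothetical embeddings
    derived from two different foundation models).
    """
    # L2-normalize so similarities are cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity matrix; matching pairs sit on the diagonal
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable log-softmax over each row
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy pulling each slide toward its paired view
    return float(-np.mean(np.diag(log_probs)))
```

Matched pairs (same slide, different FM-derived views) should yield a lower loss than mismatched pairs, which is the signal a feature-space contrastive pretraining scheme exploits.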