Large-scale pre-training models have advanced histopathology image analysis. However, existing self-supervised methods for histopathology images focus on learning patch-level features, and pre-training models for WSI-level feature learning are still lacking. In this paper, we propose a novel self-supervised learning framework for pan-cancer WSI-level representation pre-training built on a position-aware masked autoencoder (PAMA). Within PAMA, we design a position-aware cross-attention (PACA) module with a kernel reorientation (KRO) strategy and an anchor dropout (AD) mechanism. The KRO strategy captures the complete semantic structure of a WSI and eliminates ambiguity, while AD enhances the robustness and generalization of the model. We evaluated our method on 6 large-scale datasets from multiple organs for pan-cancer classification tasks. The results demonstrate the effectiveness of PAMA in learning generalized and discriminative WSI representations and in pan-cancer WSI pre-training. We also compared the proposed method with 7 WSI analysis methods; the experimental results indicate that PAMA is superior to the state-of-the-art methods. The code and checkpoints are available at https://github.com/WkEEn/PAMA.