Building on digital pathology slide scanning technology, artificial intelligence algorithms, with deep learning as the leading example, have achieved remarkable results in computational pathology. Pathology images are harder to annotate than other medical images, so there is a severe shortage of datasets suitable for supervised training of robust deep learning models. In this paper, we propose a self-supervised learning (SSL) model, the global contrast-masked autoencoder (GCMAE), which trains the encoder to represent both local and global features of pathological images and significantly improves transfer learning performance across datasets. We demonstrate the ability of the GCMAE to learn transferable representations through extensive experiments on three different disease-specific hematoxylin and eosin (HE)-stained pathology datasets: Camelyon16, NCTCRC, and BreakHis. In addition, we design an effective automated pathology diagnosis pipeline based on the GCMAE for clinical applications. The source code is publicly available at https://github.com/StarUniversus/gcmae.