Building on digital pathology slide scanning technology, artificial intelligence algorithms, particularly deep learning, have achieved remarkable results in computational pathology. Pathology images are harder to annotate than other medical images, so labeled datasets suitable for supervised training of robust deep learning models are extremely scarce. In this paper, we propose a self-supervised learning (SSL) model, the global contrast-masked autoencoder (GCMAE), which trains the encoder to represent both local and global features of pathological images and significantly improves transfer learning performance across datasets. We demonstrate the ability of the GCMAE to learn transferable representations through extensive experiments on three disease-specific hematoxylin and eosin (HE)-stained pathology datasets: Camelyon16, NCTCRC and BreakHis. In addition, we design an effective automated pathology diagnosis pipeline based on the GCMAE for clinical applications. The source code is publicly available at https://github.com/StarUniversus/gcmae.