Self-supervised pretraining aims to improve model performance by learning effective features from unlabeled data, and it has proven effective for histopathology images. Despite this success, few works focus on extracting nucleus-level information, which is essential for pathologic analysis. In this work, we propose a novel nucleus-aware self-supervised pretraining framework for histopathology images. The framework captures nuclear morphology and distribution information through unpaired image-to-image translation between histopathology images and pseudo mask images. The generation process is modulated by both conditional and stochastic style representations, ensuring the realism and diversity of the generated histopathology images used for pretraining. In addition, an instance segmentation guided strategy is employed to capture instance-level information. Experiments on 7 datasets show that, under the transfer learning protocol, the proposed pretraining method outperforms its supervised counterparts on Kather classification, multiple instance learning, and 5 dense-prediction tasks, and that it yields better results than other self-supervised approaches on 8 semi-supervised tasks. Our project is publicly available at https://github.com/zhiyuns/UNITPathSSL.
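The abstract does not say how the pseudo mask images are constructed. Purely as an illustration of what such an input to unpaired translation might look like, the sketch below draws randomly placed, randomly oriented ellipses as stand-ins for nuclei and gives each one its own instance label. The function name `make_pseudo_mask`, the ellipse assumption, and all size parameters are hypothetical and are not taken from the paper or its repository.

```python
import math
import random


def make_pseudo_mask(height=64, width=64, n_nuclei=8, seed=0):
    """Return a height x width grid of instance labels (0 = background).

    Hypothetical illustration: each "nucleus" is a random ellipse.
    """
    rng = random.Random(seed)
    mask = [[0] * width for _ in range(height)]
    for label in range(1, n_nuclei + 1):
        # Random centre, semi-axes (in pixels) and orientation.
        cy, cx = rng.uniform(0, height), rng.uniform(0, width)
        a, b = rng.uniform(3, 7), rng.uniform(3, 7)
        theta = rng.uniform(0, math.pi)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        r = int(max(a, b)) + 1  # half-size of the bounding box to scan
        for y in range(max(0, int(cy) - r), min(height, int(cy) + r + 1)):
            for x in range(max(0, int(cx) - r), min(width, int(cx) + r + 1)):
                dy, dx = y - cy, x - cx
                # Rotate the offset into the ellipse's own axes.
                u = dx * cos_t + dy * sin_t
                v = -dx * sin_t + dy * cos_t
                inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
                # Earlier nuclei keep their pixels where ellipses overlap.
                if inside and mask[y][x] == 0:
                    mask[y][x] = label
    return mask


if __name__ == "__main__":
    mask = make_pseudo_mask()
    labels = {value for row in mask for value in row}
    print(f"{len(labels) - 1} visible nuclei in a {len(mask)}x{len(mask[0])} mask")
```

Keeping a distinct label per ellipse, rather than a single binary foreground, preserves the instance-level information that an instance segmentation guided strategy would need; the count and shape distribution of ellipses control the nuclear morphology and spatial distribution represented in the mask.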