Self-supervised learning (SSL) has emerged as a key technique for training networks that generalize well to diverse tasks without task-specific supervision. This property makes SSL desirable for computational pathology, the study of digitized images of tissues, as there are many target applications and labeled training samples are often limited. However, SSL algorithms and models have been developed primarily for natural images, and whether their performance can be improved by adaptation to particular domains remains an open question. In this work, we investigate modifications to SSL for pathology data, focusing specifically on the DINOv2 algorithm. We propose alternative augmentations, regularization functions, and position encodings motivated by the characteristics of pathology images. We evaluate the impact of these changes on several benchmarks to demonstrate the value of tailored approaches.