Deep learning is revolutionising pathology, offering novel opportunities in disease prognosis and personalised treatment. Historically, stain normalisation has been a crucial preprocessing step in computational pathology pipelines, and it persists in the deep learning era. Yet, with the emergence of feature extractors trained using self-supervised learning (SSL) on diverse pathology datasets, we call this practice into question. In an empirical evaluation of publicly available feature extractors, we find that omitting stain normalisation and image augmentations does not compromise downstream performance, while yielding substantial savings in memory and compute. Further, we show that, in their latent space, the top-performing feature extractors are remarkably robust to variations in stain and to augmentations such as rotation. In contrast to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level prediction tasks in a weakly supervised setting with external validation cohorts. This work represents the most comprehensive robustness evaluation of public pathology SSL feature extractors to date, involving more than 6,000 training runs across nine tasks, five datasets, three downstream architectures, and various preprocessing setups. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors.