Classification in Histopathology: A unique deep embeddings extractor for multiple classification tasks

In biomedical imaging, deep learning-based methods are state-of-the-art for every modality (virtual slides, MRI, etc.). In histopathology, these methods can be used to detect certain biomarkers or classify lesions. However, such techniques require large amounts of data to train high-performing models, and these data can be intrinsically difficult to acquire, especially for scarce biomarkers. To address this challenge, we use a single, pre-trained, deep embeddings extractor to convert images into deep features, and we train a small, dedicated classification head on these embeddings for each classification task. This approach offers several benefits: a single pre-trained deep network can be reused for various tasks; less labeled data is needed, because classification heads have fewer parameters; and training is accelerated by up to 1000 times, which allows for much more tuning of the classification head. In this work, we perform an extensive comparison of various open-source backbones and assess their fit to the target histological image domain, using a novel method based on a proxy classification task. We demonstrate that this selection method makes it possible to choose an optimal feature extractor for different tasks on the target domain. We also introduce a feature-space augmentation strategy that substantially improves the final metrics computed for the different tasks considered. To demonstrate the benefit of this backbone selection and feature-space augmentation, we carry out experiments on three separate classification tasks and show a clear improvement on each of them: microcalcifications (29.1% F1-score increase), lymph node metastasis (12.5% F1-score increase), and mitosis (15.0% F1-score increase).
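
The pipeline described in the abstract (a frozen embeddings extractor, cached embeddings, and a small per-task classification head trained with feature-space augmentation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the "backbone" is a fixed random projection standing in for a real pre-trained network, the head is a logistic regression, the augmentation is additive Gaussian noise on the embeddings, and the data are synthetic vectors rather than histological images.

```python
import math
import random


def make_frozen_extractor(in_dim, emb_dim, seed=0):
    """Stand-in for a pre-trained backbone (assumption): a fixed random
    projection followed by tanh. Its weights are never updated."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(emb_dim)]

    def extract(image):
        return [math.tanh(sum(w * x for w, x in zip(row, image)))
                for row in weights]

    return extract


def augment_in_feature_space(embedding, rng, noise_std=0.05):
    """Hypothetical feature-space augmentation: additive Gaussian noise."""
    return [v + rng.gauss(0.0, noise_std) for v in embedding]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_head(embeddings, labels, epochs=50, lr=0.5, noise_std=0.05, seed=0):
    """Fit a logistic-regression head with SGD on cached embeddings."""
    rng = random.Random(seed)
    w = [0.0] * len(embeddings[0])
    b = 0.0
    for _ in range(epochs):
        for emb, y in zip(embeddings, labels):
            # Augment the embedding, not the image: the backbone is not re-run.
            x = augment_in_feature_space(emb, rng, noise_std)
            grad = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict(head, embedding):
    """Return the predicted class (0 or 1) for one embedding."""
    w, b = head
    return 1 if sum(wi * xi for wi, xi in zip(w, embedding)) + b > 0 else 0


def make_toy_images(n_per_class, in_dim, seed=1):
    """Synthetic 'images' (flat vectors): class 1 near +1, class 0 near -1."""
    rng = random.Random(seed)
    images, labels = [], []
    for _ in range(n_per_class):
        for label, mean in ((0, -1.0), (1, 1.0)):
            images.append([rng.gauss(mean, 0.3) for _ in range(in_dim)])
            labels.append(label)
    return images, labels


if __name__ == "__main__":
    images, labels = make_toy_images(n_per_class=20, in_dim=16)
    extract = make_frozen_extractor(in_dim=16, emb_dim=8)
    embeddings = [extract(img) for img in images]  # computed once, then reused
    head = train_head(embeddings, labels)
    correct = sum(predict(head, e) == y for e, y in zip(embeddings, labels))
    print(f"training accuracy: {correct / len(labels):.2f}")
```

Because the extractor is run only once per image, each additional task or hyperparameter setting costs only a pass over the cached embeddings, which is the source of the training speed-up the abstract refers to. In practice the stand-in extractor would be replaced by whichever pre-trained backbone is selected for the target domain.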
