Deep learning-based diagnostic systems can provide accurate and robust quantitative analysis in digital pathology. These algorithms require large amounts of annotated training data, which are impractical to obtain in pathology because of the high resolution of histopathological images. Hence, self-supervised methods have been proposed to learn features using ad-hoc pretext tasks. The self-supervised training process is time-consuming and often yields subpar feature representations because the learnt feature space is insufficiently constrained, a problem that is particularly pronounced under data imbalance. In this work, we propose to actively sample the training set using a handful of labels and a small proxy network, reducing the number of training samples required by 93% and the training time by 99%.
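The abstract does not specify how the proxy network chooses samples, so the following is only a minimal sketch of one common way such active sampling could work. It assumes an entropy-based uncertainty criterion, a softmax classifier standing in for the small proxy network, and synthetic imbalanced features in place of histopathology patches. The names `train_proxy`, `select_samples`, and `budget` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(Z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def train_proxy(X, y, n_classes, lr=0.5, epochs=200):
    """Fit a tiny softmax classifier by gradient descent.

    Assumption: this stands in for the 'small proxy network' of the abstract.
    """
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W + b)
        G = (P - Y) / n  # gradient of mean cross-entropy w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


def select_samples(X_pool, W, b, budget):
    """Return indices of the `budget` pool samples with highest predictive entropy.

    Assumption: entropy is the acquisition score; the paper may use another one.
    """
    P = softmax(X_pool @ W + b)
    entropy = -(P * np.log(P + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:budget]


# Synthetic, imbalanced pool: 950 samples of class 0 and 50 of class 1.
rng = np.random.default_rng(0)
X_pool = np.vstack([
    rng.normal(0.0, 1.0, (950, 8)),
    rng.normal(2.0, 1.0, (50, 8)),
])
y_pool = np.array([0] * 950 + [1] * 50)

# The "handful of labels": 10 labelled samples per class.
labelled = np.concatenate([
    rng.choice(950, 10, replace=False),
    950 + rng.choice(50, 10, replace=False),
])

W, b = train_proxy(X_pool[labelled], y_pool[labelled], n_classes=2)
selected = select_samples(X_pool, W, b, budget=50)
```

In this sketch only the labels at the `labelled` indices are ever read; the indices in `selected` would then define the reduced training set passed to the more expensive self-supervised stage.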