Existing deep learning models have achieved promising performance in recognizing skin diseases from dermoscopic images. However, these models can only recognize samples from predefined categories, whereas data from new, unknown categories constantly emerge once the models are deployed in the clinic. It is therefore crucial to automatically discover and identify new semantic categories in incoming data. In this paper, we propose a new framework for novel class discovery that automatically discovers new semantic classes in dermoscopy image datasets by building on knowledge of the known classes. Specifically, we first use contrastive learning to learn a robust and unbiased feature representation from all data, covering both known and unknown categories. We then propose an uncertainty-aware multi-view cross pseudo-supervision strategy, which is trained jointly on data from all categories using pseudo labels generated by a self-labeling strategy. Finally, we refine the pseudo labels by aggregating neighborhood information through local sample similarity, which improves the model's clustering performance on unknown categories. We conducted extensive experiments on the dermatology dataset ISIC 2019, and the results show that our approach can effectively leverage knowledge from known categories to discover new semantic categories. Extensive ablation experiments further validate the effectiveness of the individual modules. Our code will be released soon.
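To make the final refinement step more concrete, the following is a minimal sketch of how soft pseudo labels could be smoothed using neighborhood information. The abstract does not specify the exact rule, so everything here is an assumption made for illustration: the function name `refine_pseudo_labels`, the use of cosine similarity, the neighborhood size `k`, the mixing weight `alpha`, and the plain averaging over neighbors. It should not be read as the released implementation.

```python
import numpy as np


def refine_pseudo_labels(features, probs, k=5, alpha=0.5):
    """Blend each sample's soft pseudo label with the mean label of its k nearest neighbors.

    features: (n, d) array of feature embeddings (hypothetical input).
    probs:    (n, c) array of soft pseudo labels, one row per sample.
    k, alpha: illustrative hyperparameters, not taken from the paper.
    """
    # L2-normalize the features so that a dot product equals cosine similarity.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T

    # Exclude each sample from its own neighborhood.
    np.fill_diagonal(sim, -np.inf)

    # Indices of the k most similar samples for every row.
    neighbors = np.argsort(-sim, axis=1)[:, :k]

    # Average the neighbors' soft labels: (n, k, c) -> (n, c).
    neighbor_mean = probs[neighbors].mean(axis=1)

    # Mix the original label with the neighborhood consensus and renormalize.
    refined = alpha * probs + (1.0 - alpha) * neighbor_mean
    return refined / refined.sum(axis=1, keepdims=True)
```

In this sketch, a sample whose own pseudo label disagrees with its nearest neighbors is pulled toward the neighborhood consensus, which is the intuition behind using local sample similarity to clean up labels for unknown categories.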