Because they can exploit unlabeled data and learn meaningful representations, semi-supervised learning and contrastive learning are increasingly combined to improve performance in applications with few labeled samples and abundant unlabeled data. A common approach is to assign pseudo-labels to unlabeled samples and then select positive and negative samples from the pseudo-labeled samples for contrastive learning. However, real-world data may be imbalanced, which biases pseudo-labels toward the majority classes and in turn undermines the effectiveness of contrastive learning. To address this challenge, we propose Contrastive Learning with Augmented Features (CLAF). We design a class-dependent feature augmentation module to alleviate the scarcity of minority-class samples in contrastive learning. For each pseudo-labeled sample, we select positive and negative samples from labeled data rather than unlabeled data to compute the contrastive loss. Comprehensive experiments on imbalanced image classification datasets demonstrate the effectiveness of CLAF for imbalanced semi-supervised learning.
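To make the sampling strategy concrete, the following is a minimal sketch of a contrastive loss in which each pseudo-labeled sample acts as the anchor and its positives and negatives are drawn from the labeled set. The abstract does not specify the exact loss, so the supervised-contrastive (InfoNCE-style) form, the function name `labeled_contrastive_loss`, and the `temperature` parameter are assumptions made for illustration. The class-dependent feature augmentation module is not shown, since its details are not given here.

```python
import math


def normalize(v):
    """Scale a feature vector to unit L2 norm (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def labeled_contrastive_loss(anchor_feats, pseudo_labels,
                             labeled_feats, labels, temperature=0.1):
    """Illustrative loss, not the paper's exact formulation.

    Anchors are pseudo-labeled (unlabeled) samples. For each anchor,
    labeled samples sharing its pseudo-label are positives and all
    other labeled samples are negatives.
    """
    labeled = [normalize(f) for f in labeled_feats]
    losses = []
    for feat, pseudo in zip(anchor_feats, pseudo_labels):
        anchor = normalize(feat)
        # Temperature-scaled cosine similarity to every labeled sample.
        logits = [dot(anchor, f) / temperature for f in labeled]
        positives = [l for l, y in zip(logits, labels) if y == pseudo]
        if not positives:
            continue  # no labeled sample of this class to contrast against
        # Numerically stable log-sum-exp over all labeled samples.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Average negative log-probability over the positives.
        losses.append(-sum(p - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: two labeled samples per class in a 2-D feature space.
labeled_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
anchor = [[1.0, 0.05]]  # lies close to the class-0 labeled features

loss_correct = labeled_contrastive_loss(anchor, [0], labeled_feats, labels)
loss_wrong = labeled_contrastive_loss(anchor, [1], labeled_feats, labels)
```

In this toy setting, `loss_correct` is smaller than `loss_wrong`, because the anchor sits near the labeled class-0 features. Since the contrast set consists only of labeled samples, the labels of the positives and negatives are ground truth, and only the anchor's own label is a (possibly biased) pseudo-label.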