Partial label learning (PLL) is a typical weakly supervised learning problem in which each instance is associated with a candidate label set, of which only one label is true. However, the assumption that the ground-truth label is always in the candidate label set is often unrealistic, because annotators cannot guarantee the reliability of candidate label sets in real-world applications. We therefore study a generalization of PLL, named Unreliable Partial Label Learning (UPLL), in which the true label may not be in the candidate label set. Because of the challenges posed by unreliable labeling, existing PLL methods suffer a marked decline in performance when applied to UPLL. To address this issue, we propose a two-stage framework named Unreliable Partial Label Learning with Recursive Separation (UPLLRS). In the first stage, a self-adaptive recursive separation strategy splits the training set into a reliable subset and an unreliable subset. In the second stage, a disambiguation strategy progressively identifies the ground-truth labels in the reliable subset, while semi-supervised learning methods extract valuable information from the unreliable subset. Experimental results show that our method achieves state-of-the-art performance, particularly under high unreliability. Code and supplementary materials are available at https://github.com/dhiyu/UPLLRS.
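To make the UPLL setting concrete, the following minimal sketch generates synthetic candidate label sets in which the true label is sometimes missing. It illustrates the problem setting only, not the UPLLRS method. The function name and the parameters `flip_prob` and `unreliable_rate` are assumptions introduced for illustration and do not come from the abstract.

```python
import random


def make_unreliable_candidates(true_label, num_classes, flip_prob,
                               unreliable_rate, rng):
    """Build one candidate label set for a synthetic UPLL instance.

    flip_prob and unreliable_rate are hypothetical knobs chosen for this
    sketch; they are not taken from the source.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")

    # Each incorrect label joins the candidate set independently.
    candidates = {
        c for c in range(num_classes)
        if c != true_label and rng.random() < flip_prob
    }

    if rng.random() < unreliable_rate:
        # Unreliable case: the true label is left out of the candidate set.
        # Make sure the set is not empty by adding one incorrect label.
        if not candidates:
            others = [c for c in range(num_classes) if c != true_label]
            candidates.add(rng.choice(others))
    else:
        # Reliable case (standard PLL): the true label is a candidate.
        candidates.add(true_label)

    return candidates


# Usage: measure how often the true label is missing from its candidate set.
rng = random.Random(0)
true_labels = [rng.randrange(10) for _ in range(1000)]
candidate_sets = [
    make_unreliable_candidates(y, 10, 0.3, 0.2, rng) for y in true_labels
]
missing = sum(y not in s for y, s in zip(true_labels, candidate_sets))
print(f"instances whose true label is missing: {missing} of {len(true_labels)}")
```

Setting `unreliable_rate` to 0 recovers the ordinary PLL assumption, in which every candidate set contains the true label.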