Partial label learning (PLL) is a typical weakly supervised learning problem in which each instance is associated with a candidate label set, only one label of which is the true label. However, the assumption that the ground-truth label is always in the candidate label set is often unrealistic, because annotators cannot guarantee the reliability of candidate label sets in real-world applications. Therefore, a generalization of PLL named Unreliable Partial Label Learning (UPLL) is proposed, in which the true label may not be in the candidate label set. Because of the challenges posed by unreliable labeling, existing PLL methods suffer a marked decline in performance when applied to UPLL. To address this issue, we propose a two-stage framework named Unreliable Partial Label Learning with Recursive Separation (UPLLRS). In the first stage, a self-adaptive recursive separation strategy separates the training set into a reliable subset and an unreliable subset. In the second stage, a disambiguation strategy progressively identifies the ground-truth labels in the reliable subset, while semi-supervised learning methods extract valuable information from the unreliable subset. Experimental results show that our method achieves state-of-the-art performance, particularly when unreliability is high.
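To make the UPLL setting concrete, the following is a minimal sketch of how unreliable candidate label sets could be simulated from clean labels. It is not the procedure used in this work: the function name `make_unreliable_candidates`, the parameters `partial_rate` and `unreliable_rate`, and the rule for refilling an emptied candidate set are all assumptions made for illustration.

```python
import numpy as np


def make_unreliable_candidates(y, num_classes, partial_rate=0.3,
                               unreliable_rate=0.2, seed=0):
    """Simulate UPLL candidate sets (hypothetical generation scheme).

    Returns a boolean matrix `cand` of shape (n, num_classes), where
    cand[i, c] is True if label c is in the candidate set of instance i,
    and a boolean vector marking instances whose true label was removed.
    Requires num_classes >= 2.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n = y.shape[0]

    # Standard PLL step: each label enters the candidate set independently
    # with probability `partial_rate`; the true label is always included.
    cand = rng.random((n, num_classes)) < partial_rate
    cand[np.arange(n), y] = True

    # UPLL step (assumed): with probability `unreliable_rate`, the true
    # label is dropped from the candidate set of an instance.
    unreliable = rng.random(n) < unreliable_rate
    for i in np.flatnonzero(unreliable):
        cand[i, y[i]] = False
        if not cand[i].any():
            # Keep the set non-empty by adding one random wrong label.
            wrong = [c for c in range(num_classes) if c != y[i]]
            cand[i, rng.choice(wrong)] = True
    return cand, unreliable
```

With `unreliable_rate=0.0` this reduces to the ordinary PLL setting, in which every candidate set contains the true label. Larger values correspond to the high-unreliability regime discussed above.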