Partial label learning (PLL) aims to train multiclass classifiers from examples that are each annotated with a set of candidate labels, exactly one of which is the fixed but unknown correct label. In the last few years, the instance-independent generation process of candidate labels has been studied extensively, and many theoretical advances in PLL build on it. In practice, however, candidate labels are always instance-dependent, and there is no theoretical guarantee that a model trained on instance-dependent PLL examples converges to an ideal one. In this paper, we propose a theoretically grounded and practically effective approach named POP, i.e., PrOgressive Purification for instance-dependent partial label learning. Specifically, POP updates the learning model and purifies each candidate label set progressively in every epoch. Theoretically, we prove that POP enlarges the region where the model is reliable at an appropriately fast rate and, under mild assumptions, eventually approximates the Bayes optimal classifier. Technically, POP is compatible with arbitrary PLL losses and can improve the performance of previous PLL losses in the instance-dependent case. Experiments on benchmark and real-world datasets validate the effectiveness of the proposed method.
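To make the per-epoch purification idea concrete, the following is a minimal sketch and not the procedure defined in the paper. The abstract does not state the purification rule, so the rule used here is an assumption: a candidate label is dropped when its predicted probability trails the most probable remaining candidate by more than a margin, and the margin shrinks over epochs. The names `purify_candidates`, `progressive_purification`, and `margins` are hypothetical, and the per-epoch probabilities stand in for the outputs of a model that would be retrained between purification steps.

```python
def purify_candidates(probs, candidates, margin):
    """Return the candidates that survive one purification step.

    Assumed rule (not taken from the paper): drop label j when the best
    remaining candidate beats it by more than `margin` in predicted
    probability. The best candidate itself is always kept.
    """
    best = max(probs[j] for j in candidates)
    return {j for j in candidates if best - probs[j] <= margin}


def progressive_purification(prob_per_epoch, candidates, margins):
    """Apply one purification step per epoch and record the label sets.

    prob_per_epoch: one list of class probabilities per epoch, standing in
        for the predictions of a model updated in that epoch.
    candidates: initial, non-empty candidate label set of one example.
    margins: hypothetical per-epoch margins, typically decreasing.
    """
    current = set(candidates)
    history = [set(current)]
    for probs, margin in zip(prob_per_epoch, margins):
        current = purify_candidates(probs, current, margin)
        history.append(set(current))
    return history


# Toy run: the model grows more confident in label 0 across three epochs.
prob_per_epoch = [
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.80, 0.15, 0.05],
]
history = progressive_purification(prob_per_epoch, {0, 1, 2}, [0.5, 0.4, 0.3])
print(history)
```

In this toy run the candidate set stays at all three labels after the first epoch, loses label 2 after the second, and is reduced to label 0 after the third. Because only the candidate sets are modified, a step of this kind can sit alongside any PLL loss, which matches the flexibility claimed above.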