Annotating multi-class instances is a crucial task in machine learning. However, identifying the correct class label from a long list of candidate labels is time-consuming and laborious. To alleviate this problem, we design a novel labeling mechanism called the stochastic label. A stochastic label covers two cases: 1) the annotator identifies the correct class label from a small number of randomly given labels; 2) the annotator assigns the None label when the given labels do not contain the correct class label. In this paper, we propose a novel approach to learning from these stochastic labels. We derive an unbiased estimator that uses the limited supervised information contained in stochastic labels to train a multi-class classifier. We also justify the method theoretically by deriving its estimation error bound. Finally, we conduct extensive experiments on widely used benchmark datasets and validate the superiority of our method by comparing it with existing state-of-the-art methods.
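The two annotation cases can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's procedure: it assumes the candidate labels are drawn uniformly at random without replacement and independently of the instance, and the names `stochastic_label` and `NONE_LABEL` are hypothetical.

```python
import random

# Hypothetical sentinel standing in for the "None" annotation.
NONE_LABEL = -1


def stochastic_label(true_label, num_classes, num_candidates, rng):
    """Simulate one stochastic-label annotation.

    Assumption: candidates are sampled uniformly without replacement.
    Returns the candidate list shown to the annotator and the annotation.
    """
    candidates = rng.sample(range(num_classes), num_candidates)
    if true_label in candidates:
        # Case 1: the correct class is among the given labels.
        annotation = true_label
    else:
        # Case 2: the given labels do not contain the correct class.
        annotation = NONE_LABEL
    return candidates, annotation


rng = random.Random(0)
true_labels = [3, 7, 1, 0, 9]
annotated = [stochastic_label(y, 10, 3, rng) for y in true_labels]
```

Under this uniform-sampling assumption, an instance receives its correct label with probability equal to the number of candidates divided by the number of classes, and the None label otherwise.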