In real-world applications, one often encounters ambiguously labeled data, where different annotators assign conflicting class labels. Partial-label learning allows training classifiers in this weakly supervised setting, and state-of-the-art methods already show good predictive performance. However, even the best algorithms make incorrect predictions, which can have severe consequences when they inform actions or decisions. We propose a novel risk-consistent partial-label learning algorithm with a reject option, that is, the algorithm can reject unsure predictions. Extensive experiments on artificial and real-world datasets show that our method provides the best trade-off between the number and the accuracy of non-rejected predictions compared with competing methods, which instead use confidence thresholds to reject unsure predictions. When evaluated without the reject option, our nearest neighbor-based approach also achieves competitive prediction performance.
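To make the notion of a reject option concrete, the following is a minimal sketch of the confidence-threshold style of rejection that the abstract attributes to competing methods, applied to a simple nearest-neighbor vote over candidate label sets. It is not the proposed risk-consistent algorithm, whose details the abstract does not give. The function name `knn_partial_label_predict`, the parameters `k` and `threshold`, the use of `-1` to mark a rejection, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_partial_label_predict(X_train, candidates, X_test, n_classes, k=3, threshold=0.7):
    """Illustrative baseline: k-NN vote over candidate sets, reject below a threshold.

    Returns one class index per test point, or -1 where the prediction is rejected.
    Assumes every training instance has at least one candidate label.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)

    # Binary candidate matrix: S[i, c] = 1 if class c is a candidate label of instance i.
    S = np.zeros((len(X_train), n_classes))
    for i, cand in enumerate(candidates):
        S[i, list(cand)] = 1.0

    preds = np.empty(len(X_test), dtype=int)
    for j, x in enumerate(X_test):
        # Indices of the k nearest training points (Euclidean distance).
        dists = np.linalg.norm(X_train - x, axis=1)
        nn = np.argsort(dists)[:k]

        # Each neighbor votes for every label in its candidate set.
        votes = S[nn].sum(axis=0)
        conf = votes / votes.sum()

        # Predict the top-voted class only if its vote share reaches the threshold.
        best = int(np.argmax(conf))
        preds[j] = best if conf[best] >= threshold else -1
    return preds

# Hypothetical toy data: two clusters, some instances with ambiguous candidate sets.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
candidates = [{0}, {0, 1}, {0}, {1}, {1, 2}, {1, 2}]
X_test = [[0.2, 0.2], [5.5, 5.5]]

preds = knn_partial_label_predict(X_train, candidates, X_test, n_classes=3)
print(preds.tolist())  # [0, -1]
```

In this toy run the first test point is accepted because class 0 receives 3 of 4 neighbor votes, while the second is rejected because its top class receives only 3 of 5. Raising `threshold` rejects more predictions and lowering it rejects fewer, which is the trade-off between the number and the accuracy of non-rejected predictions that the abstract refers to.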