In Multi-Label Learning (MLL), accurately annotating every object that appears in a sample is extremely challenging because of high annotation costs and limited knowledge. A more practical and cheaper alternative is Single Positive Multi-Label Learning (SPMLL), where only one positive label needs to be provided per sample. Existing SPMLL methods usually treat unknown labels as negatives, which inevitably introduces false negatives as noisy labels. Worse, they are often trained with the Binary Cross Entropy (BCE) loss, which is notoriously not robust to noisy labels. To mitigate this issue, we customize an objective function for SPMLL that pushes only one pair of labels apart at a time, preventing negative labels from dominating the loss; this domination is the main culprit behind fitting noisy labels in SPMLL. To further combat such noisy labels, we explore the high-rankness of the label matrix, which also pushes different labels apart. By directly extending the objective from SPMLL to MLL with full labels, we derive a unified loss applicable to both settings. Experiments on real datasets demonstrate that the proposed loss is not only more robust to noisy labels in SPMLL but also works well with full labels. In addition, we empirically find that high-rankness can mitigate the dramatic performance drop in SPMLL. Most surprisingly, even without any regularization or fine-tuned label correction, adopting our loss alone outperforms state-of-the-art SPMLL methods on CUB, a dataset that severely lacks labels.
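To make the domination argument concrete, the following minimal sketch compares per-label gradients for a single sample under two objectives. It is an illustration under stated assumptions, not the loss proposed in this work: the pairwise objective `softplus(z_neg - z_pos)` with `z_neg` chosen as the highest-scoring unobserved label is a stand-in picked for simplicity, and the function names, the label count of 100, and the all-zero logits are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def an_bce_grad(logits, pos_idx):
    """Gradient w.r.t. logits of BCE summed over labels, with every
    unobserved label assumed negative (target 0)."""
    targets = np.zeros_like(logits)
    targets[pos_idx] = 1.0
    return sigmoid(logits) - targets


def pairwise_grad(logits, pos_idx):
    """Gradient w.r.t. logits of softplus(z_neg - z_pos), where z_neg is the
    highest-scoring unobserved label (an illustrative choice, not the paper's)."""
    masked = logits.copy()
    masked[pos_idx] = -np.inf          # exclude the observed positive
    neg_idx = int(np.argmax(masked))   # the single label pushed away
    g = sigmoid(logits[neg_idx] - logits[pos_idx])
    grad = np.zeros_like(logits)
    grad[neg_idx] = g
    grad[pos_idx] = -g
    return grad


# Hypothetical setup: 100 labels, untrained model (all logits zero).
logits = np.zeros(100)
pos_idx = 3

g_bce = an_bce_grad(logits, pos_idx)
g_pair = pairwise_grad(logits, pos_idx)

# Total gradient magnitude on assumed negatives vs. on the single positive.
bce_neg_mass = np.abs(np.delete(g_bce, pos_idx)).sum()
bce_pos_mass = abs(g_bce[pos_idx])

print("BCE gradient mass, negatives vs. positive:", bce_neg_mass, bce_pos_mass)
print("Labels touched by the pairwise update:", np.count_nonzero(g_pair))
```

In this toy setting, the assumed negatives carry a total gradient magnitude of 49.5 against 0.5 for the single positive, so any false negatives among them receive the same push as true negatives. The pairwise update touches only two logits per step.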