Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the high cost of collecting precise multi-label annotations. Unlike in conventional single-label semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance may carry multiple semantic labels. To solve this problem, the mainstream method developed an effective thresholding strategy to generate accurate pseudo-labels. Unfortunately, that method neglected the quality of model predictions and its potential impact on pseudo-labeling performance. In this paper, we propose a dual-perspective method to generate high-quality pseudo-labels. To improve the quality of model predictions, we perform dual-decoupling to boost the learning of correlative and discriminative features, while refining the generation and utilization of pseudo-labels. To obtain proper class-wise thresholds, we propose a metric-adaptive thresholding strategy that estimates the thresholds maximizing pseudo-label performance for a given metric on labeled data. Experiments on multiple benchmark datasets show that the proposed method achieves state-of-the-art performance and outperforms the compared methods by a significant margin.
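The idea of choosing class-wise thresholds that maximize a metric on labeled data can be illustrated with a small sketch. The abstract does not specify the search procedure or the metric, so the code below makes assumptions: it uses a plain grid search over candidate thresholds and takes per-class F1 as the metric. The function names (`f1_binary`, `estimate_classwise_thresholds`, `pseudo_label`) are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def f1_binary(y_true, y_pred):
    """F1 score for a single class; returns 0.0 when TP, FP and FN are all zero."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


def estimate_classwise_thresholds(probs, labels, candidates=None, metric=f1_binary):
    """Pick, for each class, the candidate threshold with the best metric value.

    probs:  (n_samples, n_classes) predicted probabilities on labeled data
    labels: (n_samples, n_classes) binary ground-truth matrix
    Assumption: a simple grid search; ties go to the smallest candidate.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    num_classes = probs.shape[1]
    thresholds = np.empty(num_classes)
    for c in range(num_classes):
        scores = [
            metric(labels[:, c], (probs[:, c] >= t).astype(int))
            for t in candidates
        ]
        thresholds[c] = candidates[int(np.argmax(scores))]
    return thresholds


def pseudo_label(unlabeled_probs, thresholds):
    """Assign a positive pseudo-label wherever a class probability reaches its threshold."""
    return (unlabeled_probs >= thresholds).astype(int)
```

Because each class gets its own threshold, an unlabeled instance can receive zero, one, or several positive pseudo-labels, which is what distinguishes this setting from picking the single most probable class. Passing a different `metric` callable changes the criterion the thresholds are tuned for.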