Semi-supervised learning (SSL) is a practical challenge in computer vision. Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, achieve state-of-the-art (SOTA) performance in SSL. These approaches employ a threshold-to-pseudo-label (T2L) process, which generates PLs by truncating the confidence scores that a self-trained model predicts for unlabeled data. However, self-trained models typically yield biased and high-variance predictions, especially when only a small amount of labeled data is available. To address this issue, we propose a lightweight channel-based ensemble method that consolidates multiple inferior PLs into a single PL that is theoretically guaranteed to be unbiased and low-variance. Importantly, our approach can be readily extended to any SSL framework, such as FixMatch or FreeMatch. Experimental results demonstrate that our method significantly outperforms state-of-the-art techniques on CIFAR10/100 in both effectiveness and efficiency.
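As a rough illustration of the T2L step and of why combining several predictions can help, the following minimal sketch thresholds averaged softmax scores. It is not the channel-based ensemble proposed here: the plain mean over prediction heads is a stand-in, and the function names, the threshold `tau`, and the toy logits are all hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax along the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def ensemble_probs(head_logits):
    # head_logits has shape (num_heads, batch, num_classes).
    # Assumption: a simple mean over heads stands in for the ensemble.
    return softmax(head_logits).mean(axis=0)

def threshold_to_pseudo_label(probs, tau=0.95):
    # T2L: keep the argmax class only where the confidence reaches tau.
    confidence = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    mask = confidence >= tau
    return labels, mask

# Toy logits for two heads, two unlabeled samples, three classes.
head_logits = np.array([
    [[4.0, 0.0, 0.0], [1.0, 0.9, 0.8]],
    [[5.0, 0.0, 0.0], [0.8, 1.0, 0.9]],
])
probs = ensemble_probs(head_logits)
labels, mask = threshold_to_pseudo_label(probs, tau=0.9)
```

In this toy case the first sample is confidently assigned to class 0 and kept, while the second sample, on which the two heads disagree, falls below the threshold and is masked out of the unsupervised loss.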