Semi-supervised learning (SSL) has been widely explored in recent years as an effective way of leveraging unlabeled data to reduce the reliance on labeled data. In this work, we adapt neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match. NP-Match is suited to this task for two reasons. First, NP-Match implicitly compares data points when making predictions, so the prediction for each unlabeled data point is influenced by the labeled data points that are similar to it, which improves the quality of pseudo-labels. Second, NP-Match estimates uncertainty, which can be used to select unlabeled samples with reliable pseudo-labels. Compared with uncertainty-based SSL methods implemented with Monte Carlo (MC) dropout, NP-Match estimates uncertainty with much less computational overhead, which saves time in both the training and the testing phases. We conducted extensive experiments on five public datasets under three semi-supervised image classification settings: standard, imbalanced, and multi-label semi-supervised image classification. NP-Match outperforms state-of-the-art (SOTA) approaches or achieves competitive results in these settings, which demonstrates its effectiveness and its potential for SSL. The code is available at https://github.com/Jianf-Wang/NP-Match
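To illustrate how an uncertainty estimate can be used to select unlabeled samples with reliable pseudo-labels, the following is a minimal sketch and not the NP-Match implementation. It assumes that each unlabeled sample comes with several class-probability vectors from repeated stochastic predictions, and that the predictive entropy of their mean serves as the uncertainty measure. The function names (`mean_prediction`, `entropy`, `select_pseudo_labels`) and the threshold values are hypothetical and chosen for illustration only.

```python
import math


def mean_prediction(samples):
    """Average several sampled class-probability vectors for one data point."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_pseudo_labels(sampled_probs, conf_threshold, unc_threshold):
    """Return {sample index: pseudo-label} for samples passing both filters.

    sampled_probs holds one entry per unlabeled sample; each entry is a list
    of probability vectors obtained from repeated stochastic predictions.
    A sample is kept only if the averaged prediction is confident enough
    AND its entropy (the assumed uncertainty measure) is low enough.
    """
    selected = {}
    for idx, samples in enumerate(sampled_probs):
        probs = mean_prediction(samples)
        label = max(range(len(probs)), key=probs.__getitem__)
        confident = probs[label] >= conf_threshold
        certain = entropy(probs) <= unc_threshold
        if confident and certain:
            selected[idx] = label
    return selected


# Toy data: three unlabeled samples, two stochastic predictions each.
unlabeled = [
    [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],  # confident, low entropy
    [[0.90, 0.05, 0.05], [0.10, 0.85, 0.05]],  # predictions disagree
    [[0.80, 0.10, 0.10], [0.80, 0.10, 0.10]],  # confident, but entropy too high
]

# Illustrative thresholds, not values from the paper.
print(select_pseudo_labels(unlabeled, conf_threshold=0.7, unc_threshold=0.4))
# → {0: 0}
```

In this toy example, the second sample is rejected because its averaged prediction is not confident, and the third is rejected by the uncertainty filter alone even though it passes the confidence check. The cost of such a scheme grows with the number of stochastic predictions needed per sample, which is the overhead the abstract refers to when comparing against MC dropout.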