Re-training a deep learning model each time a single data point receives a new label is impractical because of the inherent complexity of the training process. Consequently, existing active learning (AL) algorithms typically adopt a batch-based approach in which a set of data points is selected together for annotation in each AL iteration. However, this strategy often leads to redundant sampling, which reduces the efficiency of the labeling procedure. In this paper, we introduce a new AL algorithm that pairs a Gaussian process surrogate with the neural network principal learner. Our proposed model updates the surrogate learner for every new data instance, allowing it to emulate and exploit the continuous learning dynamics of the neural network without a complete retraining of the principal model for each individual label. Experiments on four benchmark datasets show that this approach yields significant improvements, matching or rivaling the performance of state-of-the-art techniques.
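To make the idea of a per-instance surrogate update concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes a Gaussian process with an RBF kernel defined over feature vectors (for example, embeddings taken from the neural network), a pure maximum-variance acquisition rule, and hypothetical names (`rbf_kernel`, `select_batch`, `noise`, `length_scale`). It relies on the fact that, in GP regression, the posterior covariance can be conditioned on a newly chosen point without knowing its label, so the surrogate can be refreshed after every selection while the principal model stays fixed.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale ** 2)


def select_batch(features, batch_size, noise=1e-2, length_scale=1.0):
    """Greedily pick points, updating the GP surrogate after each pick.

    Assumption: acquisition is the GP posterior variance. After a point is
    chosen, the posterior covariance over the whole pool is conditioned on
    it with a rank-one update, so near-duplicates of an already chosen
    point lose most of their variance and are unlikely to be picked again.
    """
    n = features.shape[0]
    cov = rbf_kernel(features, features, length_scale)  # prior covariance
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(batch_size):
        # Mask out points that are already in the batch.
        var = np.where(available, np.diag(cov), -np.inf)
        i = int(np.argmax(var))
        chosen.append(i)
        available[i] = False
        # Rank-one conditioning of the posterior covariance on point i.
        k_i = cov[:, i].copy()
        cov = cov - np.outer(k_i, k_i) / (k_i[i] + noise)
    return chosen


if __name__ == "__main__":
    # Two nearly identical points and one distant point.
    pool = np.array([[0.0, 0.0], [0.01, 0.0], [5.0, 5.0]])
    print(select_batch(pool, batch_size=2))
```

In this toy pool, scoring all points once and taking the top two would select both near-duplicates, since every point starts with the same prior variance. Updating the surrogate after the first pick steers the second pick to the distant point instead, which illustrates the redundancy problem described above.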